Abstract. A very useful fact in additive combinatorics is that analytic expressions that
can be used to count the number of structures of various kinds in subsets of Abelian
groups are robust under quasirandom perturbations, and moreover that quasirandomness
can often be measured by means of certain easily described norms, known as uniformity
norms. However, determining which uniformity norms work for which structures turns
out to be a surprisingly hard question. In [GW09a] and [GW09b, GW09c] we gave a
complete answer to this question for groups of the form $G = \mathbb{F}_p^n$, provided $p$ is not too
small. In $\mathbb{Z}_N$, substantial extra difficulties arise, of which the most important is that an
"inverse theorem" even for the uniformity norm $\|\cdot\|_{U^3}$ requires a more sophisticated (local)
formulation. When $N$ is prime, $\mathbb{Z}_N$ is not rich in subgroups, so one must use regular Bohr
neighbourhoods instead. In this paper, we prove the first non-trivial case of the main
conjecture from [GW09a].

1. Introduction

In additive combinatorics one is often interested in counting small structures in subsets
of Abelian groups. For instance, one formulation of Szemerédi's theorem is the assertion
that if $\delta > 0$, $k$ is a positive integer and $N$ is large enough, then every subset $A$ of $\mathbb{Z}_N$ of
density at least $\delta$ contains many arithmetic progressions of length $k$.

There is also an equivalent formulation of the theorem concerning functions, the formal 
statement of which is as follows. 

Theorem 1.1. Let $\delta > 0$ and let $k$ be a positive integer. Then there is a constant $c =
c(\delta, k) > 0$ such that for every $N$ and every function $f : \mathbb{Z}_N \to [0, 1]$ with $\mathbb{E}_x f(x) \geq \delta$,

$\mathbb{E}_{x,d} f(x) f(x + d) \cdots f(x + (k-1)d) \geq c.$

Here we use the symbol "$\mathbb{E}$" to denote averages over $\mathbb{Z}_N$. For instance, $\mathbb{E}_{x,d}$ is shorthand
for $N^{-2} \sum_{x,d \in \mathbb{Z}_N}$. If we take $f$ to be the characteristic function of a set $A$ of density $\delta$,
then the conclusion of Theorem 1.1 states that the number of arithmetic progressions of
length $k$ in $A$, or rather the number of pairs $(x, d)$ such that $x, x + d, \dots, x + (k-1)d$ all
lie in $A$, is at least $c(\delta, k)N^2$. (It is not necessary to assume that $N$ is sufficiently large,
because for small $N$ the degenerate progressions where $d = 0$ are numerous enough for the
theorem to be true. But $N$ has to be large for it to become a non-trivial statement.)
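As a small illustration of the averaging notation, the following sketch computes $\mathbb{E}_{x,d} f(x)f(x+d)\cdots f(x+(k-1)d)$ by brute force and compares it with a direct count of pairs $(x,d)$. The set `A` and the modulus `N = 13` are arbitrary choices made for the example and do not come from the text.

```python
from itertools import product

def ap_average(f, N, k):
    """E_{x,d} f(x) f(x+d) ... f(x+(k-1)d), with x and d ranging over Z_N."""
    total = 0.0
    for x, d in product(range(N), repeat=2):
        term = 1.0
        for j in range(k):
            term *= f[(x + j * d) % N]
        total += term
    return total / N**2

def count_ap_pairs(A, N, k):
    """Number of pairs (x, d) with x, x+d, ..., x+(k-1)d all in A (d = 0 allowed)."""
    A = set(A)
    return sum(
        all((x + j * d) % N in A for j in range(k))
        for x, d in product(range(N), repeat=2)
    )

N, k = 13, 3
A = {0, 1, 3, 4, 9}  # illustrative subset of Z_13, chosen arbitrarily
f = [1.0 if x in A else 0.0 for x in range(N)]
avg = ap_average(f, N, k)
pairs = count_ap_pairs(A, N, k)
```

When $f$ is the characteristic function of $A$, `avg * N**2` agrees with `pairs`, and the degenerate progressions with $d = 0$ already contribute $|A|$ pairs.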

There are now several known ways of proving Szemerédi's theorem. One of them, an
analytic approach due to the first author [G01], relies heavily on the fact that the quantity
$\mathbb{E}_{x,d} f(x)f(x+d) \cdots f(x+(k-1)d)$ is robust under perturbations of $f$ that are quasirandom
in a suitable sense. More precisely, in [G01] a norm $\|\cdot\|_{U^k}$ was defined for each $k$, which
has the property that if $f_1, \dots, f_k$ are functions with $\|f_i\|_\infty \leq 1$ for every $i$, then

(1) $|\mathbb{E}_{x,d} f_1(x) f_2(x + d) \cdots f_k(x + (k-1)d)| \leq \min_i \|f_i\|_{U^{k-1}}.$

From this it is simple to deduce that if $f$ and $g$ are two functions from $\mathbb{Z}_N$ to $[0,1]$, then

$\mathbb{E}_{x,d} f(x)f(x + d) \cdots f(x + (k-1)d) - \mathbb{E}_{x,d} g(x)g(x + d) \cdots g(x + (k-1)d)$

has magnitude at most $k\|f - g\|_{U^{k-1}}$. If we choose a function $h$ randomly by taking the
values $h(x)$ to be bounded random variables of mean 0, then with high probability $\|h\|_{U^{k-1}}$
will be very small. Thus, the $U^k$ norms are measures of a certain kind of quasirandomness
that is connected with cancellations in expressions such as
$\mathbb{E}_{x,d} h(x)h(x + d) \cdots h(x + (k-1)d)$.
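To make the role of the uniformity norms concrete, here is a brute-force sketch for the smallest case $k = 3$, where inequality (1) involves the $U^2$ norm. It uses the standard definition $\|f\|_{U^2}^4 = \mathbb{E}_{x,a,b} f(x)\overline{f(x+a)}\,\overline{f(x+b)}f(x+a+b)$, which is not restated in this section, and takes $N = 11$. The test function `h` (the Legendre symbol modulo 11) is our own choice of a deterministic function that behaves quasirandomly; it is not an example from the text.

```python
import cmath

def u2_norm(f, N):
    """||f||_{U^2}, computed from
    ||f||^4 = E_{x,a,b} f(x) conj(f(x+a)) conj(f(x+b)) f(x+a+b)."""
    total = 0j
    for x in range(N):
        for a in range(N):
            for b in range(N):
                total += (f[x] * f[(x + a) % N].conjugate()
                          * f[(x + b) % N].conjugate() * f[(x + a + b) % N])
    return max(total.real / N**3, 0.0) ** 0.25

def three_term_average(f1, f2, f3, N):
    """E_{x,d} f1(x) f2(x+d) f3(x+2d)."""
    total = 0j
    for x in range(N):
        for d in range(N):
            total += f1[x] * f2[(x + d) % N] * f3[(x + 2 * d) % N]
    return total / N**2

N = 11
omega = cmath.exp(2j * cmath.pi / N)
chi = [omega ** (3 * x) for x in range(N)]          # a character: highly structured
residues = {(x * x) % N for x in range(1, N)}
# Legendre symbol mod 11 (illustrative choice): +1 on squares, -1 on non-squares, 0 at 0.
h = [complex(0) if x == 0 else complex(1 if x in residues else -1) for x in range(N)]

norm_chi = u2_norm(chi, N)
norm_h = u2_norm(h, N)
lam = abs(three_term_average(h, h, h, N))
```

The character has $U^2$ norm 1, while `h` has a noticeably smaller norm, and the three-term average of `h` is bounded by that norm as inequality (1) predicts for odd $N$.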
The proof of inequality (1) is a relatively straightforward inductive argument that
involves repeated application of the Cauchy-Schwarz inequality. Once one has this
argument, it is natural to try to generalize it to other expressions such as
$\mathbb{E}_{x,y,z} f(x + y)f(x + z)f(y + z)$ or $\mathbb{E}_{x,d} f(d)f(x)f(x + d)f(x + 2d)$. In general, one can take a system of linear
forms $L_1, \dots, L_r$ in $s$ variables $x_1, \dots, x_s$ (that is, for each $i$ we write $x$ for $(x_1, \dots, x_s)$
and define $L_i(x) = \sum_{j=1}^{s} a_{ij} x_j$ for some integers $a_{i1}, \dots, a_{is}$) and examine the quantity
$\mathbb{E}_x \prod_{i=1}^{r} f(L_i(x))$, or the more general quantity $\mathbb{E}_x \prod_{i=1}^{r} f_i(L_i(x))$. We would then like to
know for which uniformity norms $U^k$ it is true that these expressions are robust under
small $U^k$ perturbations.
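The general quantity $\mathbb{E}_x \prod_i f(L_i(x))$ can be evaluated by brute force for small $N$. The sketch below represents each linear form by its tuple of integer coefficients; the function name `linear_form_average` and the sample systems are illustrative choices of ours.

```python
from itertools import product

def linear_form_average(f, forms, N):
    """E_x prod_i f(L_i(x)), where x ranges over Z_N^s and
    L_i(x) = sum_j a_ij x_j for the coefficient tuple forms[i] = (a_i1, ..., a_is)."""
    s = len(forms[0])
    total = 0.0
    for x in product(range(N), repeat=s):
        term = 1.0
        for coeffs in forms:
            term *= f[sum(a * xj for a, xj in zip(coeffs, x)) % N]
        total += term
    return total / N**s

# Three-term progressions x, x+d, x+2d in the variables (x, d).
ap3 = [(1, 0), (1, 1), (1, 2)]
# The system x+y, x+z, y+z in the variables (x, y, z).
triangle = [(1, 1, 0), (1, 0, 1), (0, 1, 1)]

N = 13
A = {0, 1, 3, 4, 9}  # arbitrary illustrative set
indicator = [1.0 if x in A else 0.0 for x in range(N)]
ap3_density = linear_form_average(indicator, ap3, N)
constant_value = linear_form_average([0.5] * 5, triangle, 5)
```

For a constant function of value $1/2$ and a system of three forms the average is $1/8$, which gives a quick sanity check on the implementation.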

This question was first addressed by Green and Tao [GrT06], who were interested in
proving asymptotic estimates for expressions such as these when $f$ is the characteristic
function of the primes up to $n$ (or rather the closely related von Mangoldt function). They
defined a notion of complexity for a system of linear forms. This is a positive integer $k$ with
the property that if a system has complexity $k$, then the corresponding analytic expression
will be robust under small $U^{k+1}$ perturbations. (As we shall see, there are good reasons
for defining complexity in a way that leads to this difference of 1.) Roughly speaking, the
property they identified picks out the minimal $k$ for which repeated use of the
Cauchy-Schwarz inequality can be used to prove robustness under small $U^{k+1}$ perturbations.

However, it turns out that there are some systems of linear forms of complexity $k$ that
are robust under small perturbations in $U^{j+1}$ for some $j < k$. (Since the $U^k$ norms
increase as $k$ increases, the assumption that a function is small in $U^{j+1}$ is weaker than
the assumption that it is small in $U^{k+1}$.) This phenomenon was first demonstrated in
[GW09a], where we showed that if $G$ is the group $\mathbb{F}_p^n$, then there is a system of linear forms
of complexity 2 such that the corresponding analytic expression is robust under small $U^2$
perturbations. (A similar phenomenon in ergodic theory was discovered independently
by Leibman [L07].) Because Green and Tao's definition, appropriately modified, appears
to capture all systems of linear forms for which Cauchy-Schwarz-type arguments work
(though we have not actually formulated and proved a statement along these lines), one
must use additional tools. The particular tool we used was a new technique known as
quadratic Fourier analysis, which we shall discuss in some detail in §3. A weak "local"
form of quadratic Fourier analysis was introduced and used in [G01] to prove Szemerédi's
theorem (for progressions of length 4; higher order Fourier analysis was needed for the
general case). A more "global" version was developed by Green and Tao [GrT08] and will
be essential to this paper.
In [GW09a] we made the following conjecture.

Conjecture 1.2. Let $L_1, \dots, L_r$ be a system of linear forms in $x = (x_1, \dots, x_s)$ and let $G$
be the group $\mathbb{Z}_N$ or $\mathbb{F}_p^n$ for sufficiently large $p$. Suppose also that the $k$th powers of the forms
$L_i$ are linearly independent. Then $\mathbb{E}_x \prod_{i=1}^{r} f(L_i(x))$ is close to $\mathbb{E}_x \prod_{i=1}^{r} g(L_i(x))$ whenever
$f$ and $g$ are bounded functions and $\|f - g\|_{U^k}$ is small.

It is not hard to prove the converse of this conjecture, so if it is true then it identifies 
precisely the minimal uniformity norm with respect to which the multilinear expression 
derived from the linear forms is continuous (where by "continuous" we mean continuous 
in a way that does not depend on the size of the group): it is given by the smallest k such 
that the kth powers of the linear forms are linearly independent. In such a case, we shall 
say that the forms are kth-power independent. When k = 2 we shall say that they are 
square independent. We also formulated a more general conjecture that covers the case of 
m different functions. 
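The notion of $k$th-power independence can be tested mechanically. The sketch below expands each $L_i^k$ in the monomial basis and computes the rank of the resulting coefficient vectors with exact rational arithmetic. Testing independence over the rationals is a simplifying assumption on our part: the conjecture concerns $\mathbb{Z}_N$ or $\mathbb{F}_p^n$ with $p$ large, where the answer could differ for small moduli. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations_with_replacement
from math import factorial

def power_coefficients(coeffs, k):
    """Coefficient vector of (sum_j a_j x_j)^k in the basis of degree-k monomials."""
    s = len(coeffs)
    row = []
    for monomial in combinations_with_replacement(range(s), k):
        exps = [monomial.count(j) for j in range(s)]
        multinomial = factorial(k)
        value = Fraction(1)
        for a, e in zip(coeffs, exps):
            multinomial //= factorial(e)   # exact at every step
            value *= Fraction(a) ** e
        row.append(multinomial * value)
    return row

def rank(rows):
    """Rank over the rationals by Gauss-Jordan elimination."""
    rows = [list(r) for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][col] != 0:
                factor = rows[i][col] / rows[rk][col]
                rows[i] = [u - factor * v for u, v in zip(rows[i], rows[rk])]
        rk += 1
        if rk == len(rows):
            break
    return rk

def smallest_independent_power(forms, max_k=6):
    """Smallest k <= max_k such that the k-th powers of the forms are independent."""
    for k in range(1, max_k + 1):
        if rank([power_coefficients(f, k) for f in forms]) == len(forms):
            return k
    return None

ap3 = [(1, 0), (1, 1), (1, 2)]           # x, x+d, x+2d
ap4 = [(1, 0), (1, 1), (1, 2), (1, 3)]   # x, x+d, x+2d, x+3d
k3 = smallest_independent_power(ap3)
k4 = smallest_independent_power(ap4)
```

For progressions of length 3 and 4 this returns 2 and 3 respectively, which matches the norms $U^2$ and $U^3$ appearing in inequality (1).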

In [GW09c] we proved Conjecture 1.2 in the case where $G = \mathbb{F}_p^n$, using the very recent
inverse theorem for the $U^k$ norm in that context, which was proved by Bergelson, Tao
and Ziegler [BTZ09, TZ08]. This inverse theorem opens the way to cubic Fourier analysis,
quartic Fourier analysis, and so on. Our result is the first application of this higher-order
Fourier analysis. The first application of higher-order Fourier analysis on $\mathbb{Z}_N$, which has
recently become a theorem (though so far only the $k = 4$ case is available [GrTZ09]), is
to linear equations in the primes: Green and Tao have already obtained asymptotics for
the numbers of solutions for all systems of finite complexity, conditional on the inverse
conjecture for the $U^k$ norm in $\mathbb{Z}_N$, which they have now proved with Ziegler.

Quadratic Fourier analysis on $\mathbb{Z}_N$ has had other applications. For example, a
modification of Theorem 7.5 was used by Candela [C08] to prove that if $A$ is a dense subset of
$\{1, 2, \dots, n\}$ then the set of all $d$ such that $A$ contains an arithmetic progression of length
3 and common difference $d$ must itself contain an arithmetic progression of length at least
$(\log\log N)^c$.

In [GW09a] we proved the first non-trivial case of Conjecture 1.2 for $\mathbb{F}_p^n$, which is the case
of square-independent systems of complexity 2. However, we obtained a bound of tower
type, so from a quantitative point of view this result was not very satisfactory. In [GW09b],
we improved this bound to one that was doubly exponential. The general inverse theorem
for functions on $\mathbb{F}_p^n$ has so far been proved only as a purely qualitative statement, so we
did not obtain any bounds at all for the other cases of Conjecture 1.2. In this paper, we
shall prove Conjecture 1.2 for square independent systems of complexity 2 in $\mathbb{Z}_N$. In other
words, if $f$ and $g$ are bounded functions and $L_1, \dots, L_r$ is a square independent system of
complexity 2, we are interested in how small the $U^2$ norm $\|f - g\|_{U^2}$ has to be to guarantee
that $\mathbb{E}_x \prod_{i=1}^{r} f(L_i(x))$ is within $\epsilon$ of $\mathbb{E}_x \prod_{i=1}^{r} g(L_i(x))$. We go to considerable efforts to
obtain a respectable bound, which in the end is a doubly exponential dependence on $\epsilon$. If
we had worked less hard then we would have had to settle for a tower-type bound. To
obtain the good bound (relatively speaking) we shall use some of the ideas from [GW09b]
as well as some new ideas to deal with problems that do not arise in $\mathbb{F}_p^n$.

The big difference between $\mathbb{F}_p^n$ and $\mathbb{Z}_N$ is that $\mathbb{F}_p^n$ has many subgroups that closely
resemble $\mathbb{F}_p^n$ itself. If $N$ is prime, then $\mathbb{Z}_N$ has no non-trivial subgroups at all, and it
becomes necessary to consider subsets that are "approximately closed" under addition.
These subsets are called regular Bohr sets, and we shall discuss them in the next section.
Here we remark that the notion of a Bohr set originated in the study of almost periodic
functions and has played a very important role in additive combinatorics since Ruzsa's
pioneering proof [R94] of Freiman's theorem [F73]. The additional hypothesis of regularity,
which makes it possible to treat Bohr sets like subgroups, was introduced by Bourgain [B99]
and has subsequently been used by several authors.

By proving our results first for $\mathbb{F}_p^n$ and then adapting the arguments to the $\mathbb{Z}_N$ context,
we are following a general course urged by Green in [Gr07]. The reason for doing it is that
it splits problems into two parts. The first part, which is in a sense more fundamental, is
to get one's result in a model context where certain distracting technicalities do not arise.
Once one has done that, one has a global structure for the proof, and one can usually find
a proof in $\mathbb{Z}_N$ that has the same global structure as the proof in $\mathbb{F}_p^n$.

That is the case for our result, so although we have made this paper self-contained, the
reader will almost certainly prefer to begin by reading [GW09b]. However, the adaptation
of our arguments to $\mathbb{Z}_N$ is by no means a completely mechanical process. Some parts
are, by now, fairly routine, but certain concepts that are quite useful for proving results
in $\mathbb{F}_p^n$ do not have obvious analogues in $\mathbb{Z}_N$, and some lemmas that are almost trivial in
$\mathbb{F}_p^n$ become serious statements with non-obvious proofs in $\mathbb{Z}_N$. We shall highlight the less
obvious parts of the adaptation as they arise, since some of them may well find other uses.

Very recently indeed, Green and Tao [GrT10] have proved Conjecture 1.2 in full
generality in $\mathbb{Z}_N$, using the recent inverse theorem. Their method is completely different
from ours, which was almost certainly necessary: it seems that they have found the right
framework for studying the problem if one is content with arguments that do not give
reasonable bounds. However, in order to obtain the quantitative statement that we prove
here, it seems to be necessary (at least given the technology as it is at present) to use
different, more "old-fashioned" techniques. For the time being a proof of the full conjecture
with good bounds looks out of reach: not the least of the difficulties would be obtaining a
quantitative version of the inverse theorem.

2. Bohr sets and their basic properties 

Let $K$ be a subset of $\mathbb{Z}_N$ and let $\rho > 0$. The Bohr set $B(K,\rho)$ is the set of all $x \in \mathbb{Z}_N$
such that $|\omega^{rx} - 1| \leq \rho$ for every $r \in K$ (here $\omega = e^{2\pi i/N}$). As we have just said, Bohr sets
will play the role that subgroups played for functions defined on $\mathbb{F}_p^n$. However, they are not
closed under addition, and this causes problems.

The way to deal with these problems is to use the fact that Bohr sets do have at least some
closure properties. In particular, if $x \in B(K,\rho)$ and $y \in B(K,\sigma)$, then $x + y \in B(K,\rho + \sigma)$.
To use this fact, one takes $\sigma$ small enough for $B(K,\rho + \sigma)$ to be approximately equal to
$B(K,\rho)$.
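The approximate closure property is easy to observe numerically. The sketch below computes Bohr sets directly from the definition, taking $\omega = e^{2\pi i/N}$ (an assumption, since the defining formula is partly illegible in our copy of the text), and checks that $B(K,\rho) + B(K,\sigma) \subset B(K,\rho + \sigma)$. The values of `N`, `K`, `rho` and `sigma` are arbitrary.

```python
import cmath

def bohr_set(K, rho, N):
    """B(K, rho) = {x in Z_N : |omega^(r x) - 1| <= rho for every r in K},
    with omega = exp(2 pi i / N) (assumed normalisation)."""
    omega = cmath.exp(2j * cmath.pi / N)
    return {x for x in range(N)
            if all(abs(omega ** ((r * x) % N) - 1) <= rho for r in K)}

N = 101
K = {1, 7, 30}          # illustrative frequency set
rho, sigma = 0.9, 0.3   # illustrative radii

B = bohr_set(K, rho, N)
B_small = bohr_set(K, sigma, N)
B_big = bohr_set(K, rho + sigma, N)

# Every sum x + y with x in B(K, rho) and y in B(K, sigma) lands in B(K, rho + sigma).
sums_contained = all((x + y) % N in B_big for x in B for y in B_small)
```

The containment follows from the triangle inequality $|\omega^{r(x+y)} - 1| \leq |\omega^{rx} - 1| + |\omega^{ry} - 1|$, so `sums_contained` holds for any choice of parameters.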

However, such an approach can work only if the size of the set $B(K,\rho)$ depends
sufficiently continuously on $\rho$, which is not always the case. This fact motivated an important
definition due to Bourgain [B99]. Let $B = B(K,\rho)$ be a Bohr set. $B$ is said to be regular
if, for every $\epsilon > 0$, the Bohr set $B(K,\rho(1 + \epsilon))$ has cardinality at most $|B|(1 + 100|K|\epsilon)$ and
the Bohr set $B(K,\rho(1 - \epsilon))$ has cardinality at least $|B|(1 - 100|K|\epsilon)$. The precise form of
this definition is what comes out of the following lemma (see for example [TV06]), which
tells us that it is easy to find regular Bohr sets.

Lemma 2.1. Let $K$ be a subset of $\mathbb{Z}_N$ and let $\rho_0 > 0$. Then there exists $\rho$ such that
$\rho \in [\rho_0, 2\rho_0]$ and the Bohr set $B(K,\rho)$ is regular.
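The regularity condition can be explored numerically, with the caveat that a computer can only test finitely many values of $\epsilon$. The following sketch therefore checks the two inequalities of the definition on a finite grid, which is a necessary condition for regularity rather than a proof of it, and scans $[\rho_0, 2\rho_0]$ for a radius that passes. The grid, the step count and the helper names are assumptions of ours, and $\omega = e^{2\pi i/N}$ is assumed as before.

```python
import cmath

def bohr_set(K, rho, N):
    """B(K, rho) with omega = exp(2 pi i / N) (assumed normalisation)."""
    omega = cmath.exp(2j * cmath.pi / N)
    return {x for x in range(N)
            if all(abs(omega ** ((r * x) % N) - 1) <= rho for r in K)}

def size_ratios(K, rho, N, eps):
    """Return (|B(K, rho(1-eps))| / |B|, |B(K, rho(1+eps))| / |B|)."""
    base = len(bohr_set(K, rho, N))   # at least 1, since 0 always lies in B
    lower = len(bohr_set(K, rho * (1 - eps), N)) / base
    upper = len(bohr_set(K, rho * (1 + eps), N)) / base
    return lower, upper

def passes_on_grid(K, rho, N, eps_grid):
    """Check the two regularity inequalities for the finitely many eps in eps_grid."""
    bound = 100 * len(K)
    for eps in eps_grid:
        lower, upper = size_ratios(K, rho, N, eps)
        if upper > 1 + bound * eps or lower < 1 - bound * eps:
            return False
    return True

def find_candidate_radius(K, rho0, N, eps_grid, steps=50):
    """Scan [rho0, 2 rho0] for a radius passing the grid test (None if none does)."""
    for i in range(steps + 1):
        rho = rho0 * (1 + i / steps)
        if passes_on_grid(K, rho, N, eps_grid):
            return rho
    return None

N, K, rho0 = 101, {1, 7}, 0.5
eps_grid = [0.001, 0.005, 0.01, 0.05, 0.1]
candidate = find_candidate_radius(K, rho0, N, eps_grid)
lower, upper = size_ratios(K, rho0, N, 0.1)
```

A radius returned by `find_candidate_radius` is only a candidate: Lemma 2.1 guarantees that a genuinely regular radius exists in the interval, but this scan does not certify one.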

It will be useful to have a concise notation that allows us to talk about pairs of Bohr 
sets that have the approximate closure property under addition. 

Definition. Let $B$ be a regular Bohr set $B(K,\rho)$. Then we say that a subset $B' \subset B$ is
$\epsilon$-central for $B$, and write $B' \prec_\epsilon B$, if $B' = B(K,\sigma)$ for some $\sigma \in [\epsilon\rho/400|K|, \epsilon\rho/200|K|]$
and $B'$ is also regular. Given a pair of Bohr sets $B' \prec_\epsilon B$, we define the closure of $B$ to
be the set $B^+ = B(K,\rho + \sigma)$ and the interior to be the set $B^- = B(K,\rho - \sigma)$.

The definitions of closure and interior depend on the central set B', so they cannot be used 
unless B' has been specified. But this does not cause any problems. 

Because we are dealing with quadratic rather than linear local Fourier analysis, we will
sometimes have to repeat the closure and interior operations, which, unlike their topological
counterparts, are not idempotent (and therefore not strictly speaking closure and interior
operations at all). Thus, we define $B^{++}$ to be $B(K,\rho + 2\sigma)$ and $B^{--}$ to be $B(K,\rho - 2\sigma)$.

Note that in many of the early lemmas we do not actually need the central Bohr set B' 
to be regular. However, we often apply a sequence of such lemmas, so it is convenient to 
insist on regularity at all times. 

There are many closely related ways of using the regularity condition on a Bohr set. 
The next lemma, which will be used later, is a typical one. It exploits the fact that regular 
Bohr sets have "small boundaries".

Lemma 2.2. Let $K$ be a subset of $\mathbb{Z}_N$ and let $x_1, \dots, x_m$ be a sequence of $m$ elements of
$\mathbb{Z}_N$. Suppose that the Bohr sets $B$ and $B'$ satisfy $B' \prec_\epsilon B$. Then for all but at most $\epsilon m|B|$
values of $x$ the following statement is true: for every $i$, $B' + x$ is either contained in $B + x_i$
or disjoint from it.

Proof. If $x - x_i \in B^-$, then $B' + x - x_i$ is a subset of $B^- + B' \subset B$, and therefore
$B' + x \subset B + x_i$. Similarly, but in the other direction, if $(B' + x) \cap (B + x_i) \neq \emptyset$, then
$x - x_i \in B^+$. Therefore, the only way that $B' + x$ can fail to be either contained in $B + x_i$
or disjoint from it is if $x - x_i \in B^+ \setminus B^-$. However, by the definition of regularity, the
cardinality of $B^+ \setminus B^-$ is at most $\epsilon|B|$. The lemma follows. □

Another very useful principle indeed is that if $B$ is a regular Bohr set, $B'$ is a central
subset, and $f$ is a bounded function, then $\mathbb{E}_{x \in B} f(x)$ is approximately equal to
$\mathbb{E}_{x \in B, y \in B'} f(x + y)$. Indeed, this is the most common way that regularity has been applied.
We shall need some less standard (but not difficult) variants of this principle; for the
convenience of the reader we give proofs of all the results of this kind that we need. We shall
use the notation $\approx_\epsilon$ to stand for the relation "differs by at most $\epsilon$ from".

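Lemma 2.3. Let $\epsilon > 0$. Let $B$ and $B'$ be Bohr sets satisfying $B' \prec_\epsilon B$. Then for every
function $f : \mathbb{Z}_N \to \mathbb{C}$ such that $\|f\|_\infty \leq 1$ and for every function $g : \mathbb{Z}_N^2 \to \mathbb{C}$ such that
$\|g\|_\infty \leq 1$ the following statements hold.

(i) $\mathbb{E}_{x \in B} f(x) \approx_\epsilon \mathbb{E}_{x \in B} \mathbb{E}_{y \in P} f(x + y)$ for every subset $P \subset B'$.

(ii) $\mathbb{E}_{x \in B} f(x) \approx_\epsilon \mathbb{E}_{x \in B^-} f(x)$.

(iii) $\mathbb{E}_{x \in B^-} f(x) \approx_{3\epsilon} \mathbb{E}_{x \in B^-} \mathbb{E}_{y \in B'} f(x + y)$.

(iv) $\mathbb{E}_{x,x' \in B} g(x, x') \approx_{4\epsilon} \mathbb{E}_{x,x' \in B} \mathbb{E}_{y \in B'} g(x + y, x' + y)$.

(v) $\mathbb{E}_{x,x' \in B^-} g(x, x') \approx_{8\epsilon} \mathbb{E}_{x,x' \in B^-} \mathbb{E}_{y \in B'} g(x + y, x' + y)$.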



Proof. Since $\|f\|_\infty \leq 1$, for every $y \in B'$ we have the inequality

$|\mathbb{E}_{x \in B} f(x + y) - \mathbb{E}_{x \in B} f(x)| \leq |B|^{-1} |B \,\Delta\, (B + y)|.$

But $B \,\Delta\, (B + y) \subset B^+ \setminus B^-$, so the right hand side is at most $\epsilon$, by the regularity of $B$.
Since $\mathbb{E}_{x \in B} \mathbb{E}_{y \in P} f(x + y) = \mathbb{E}_{y \in P} \mathbb{E}_{x \in B} f(x + y)$, part (i) follows from the triangle inequality.
To prove (ii), we begin by noting that

$|\mathbb{E}_{x \in B} f(x) - |B|^{-1} \sum_{x \in B^-} f(x)| \leq |B|^{-1} |B \setminus B^-|.$

By regularity, the right hand side is at most $\epsilon/2$. It is also easy to check that

$|\mathbb{E}_{x \in B^-} f(x) - |B|^{-1} \sum_{x \in B^-} f(x)| \leq \epsilon/2.$

It follows that $|\mathbb{E}_{x \in B} f(x) - \mathbb{E}_{x \in B^-} f(x)| \leq \epsilon$. Applying (ii) to both sides of (i), we deduce
(iii). The proof of (iv) is very similar to that of (i). For each $y \in B'$ we have the inequality

$|\mathbb{E}_{x,x' \in B} g(x + y, x' + y) - \mathbb{E}_{x,x' \in B} g(x, x')| \leq |B|^{-2} |B^2 \,\Delta\, (B + y)^2|.$

From the fact that $|B \,\Delta\, (B + y)| \leq \epsilon|B|$ it follows that $|B^2 \,\Delta\, (B + y)^2| \leq 4\epsilon|B|^2$. This
implies (iv), just as the analogous statement implied (i). Finally, if we apply (ii) twice to
both sides of (iv) we obtain (v). □
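The first inequality in the proof holds for any set $B$ and any function bounded by 1, regularity being needed only to make its right-hand side small. The sketch below checks it numerically for a Bohr set and a quadratic phase. The parameters are arbitrary and the Bohr sets used are not verified to be regular, so this illustrates only the unconditional step; $\omega = e^{2\pi i/N}$ is assumed.

```python
import cmath

def bohr_set(K, rho, N):
    """B(K, rho) with omega = exp(2 pi i / N) (assumed normalisation)."""
    omega = cmath.exp(2j * cmath.pi / N)
    return {x for x in range(N)
            if all(abs(omega ** ((r * x) % N) - 1) <= rho for r in K)}

def shift_gap(f, B, y, N):
    """Return (|E_{x in B} f(x+y) - E_{x in B} f(x)|, |B symdiff (B+y)| / |B|)."""
    mean_shifted = sum(f[(x + y) % N] for x in B) / len(B)
    mean_plain = sum(f[x] for x in B) / len(B)
    shifted = {(x + y) % N for x in B}
    return abs(mean_shifted - mean_plain), len(B ^ shifted) / len(B)

N = 101
K = {1, 2}                        # illustrative frequency set
B = bohr_set(K, 1.0, N)           # plays the role of B
B_small = bohr_set(K, 0.3, N)     # plays the role of B'

omega = cmath.exp(2j * cmath.pi / N)
f = [omega ** ((x * x) % N) for x in range(N)]   # a bounded test function

gaps = [shift_gap(f, B, y, N) for y in sorted(B_small)]
```

Each pair in `gaps` has its first entry bounded by its second; for $y = 0$ both entries vanish.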



3. Quadratic Fourier analysis on $\mathbb{Z}_N$

Conventional Fourier analysis on an Abelian group $G$ decomposes a function $f : G \to \mathbb{C}$
into a linear combination of characters, which are homomorphisms from $G$ to $\mathbb{T} = \{z \in \mathbb{C} :
|z| = 1\}$. If we allow ourselves a phase shift (that is, if we multiply a character by $e^{i\theta}$ for
some $\theta$) then we obtain a function $\gamma$ that may not be a group homomorphism, but it is still
a (multiplicative) Freiman homomorphism, since it satisfies the identity $\gamma(x + d)\gamma(x)^{-1} =
\gamma(y + d)\gamma(y)^{-1}$ for every $x$, $y$ and $d$ in $G$.

Quadratic Fourier analysis replaces Freiman homomorphisms by a natural quadratic
analogue. We can restate the identity above as $\gamma(x)\gamma(x + a)^{-1}\gamma(x + b)^{-1}\gamma(x + a + b) = 1$
for every $x$, $a$ and $b$ in $G$. If $A \subset G$, then a function $\gamma : A \to \mathbb{T}$ is a (multiplicative)
quadratic homomorphism if

$\gamma(x)\gamma(x+a)^{-1}\gamma(x+b)^{-1}\gamma(x+c)^{-1}\gamma(x+a+b)\gamma(x+a+c)\gamma(x+b+c)\gamma(x+a+b+c)^{-1} = 1$

for every $x$, $a$, $b$ and $c$ in $G$. The word "quadratic" is used because if $A = G = \mathbb{Z}_N$, then
$\gamma$ has to be of the form $\gamma(x) = e^{2\pi i q(x)/N}$ for some quadratic function $q : \mathbb{Z}_N \to \mathbb{Z}_N$, and
similar statements are true for several other groups. Because of this, we shall also refer
to these functions as quadratic phase functions. For more general subsets $A$ it is less easy
to describe quadratic homomorphisms explicitly, but if $A$ is a sufficiently structured set,
such as a coset of a subgroup of $\mathbb{F}_p^n$ (when $p$ is not too small) or a Bohr set in $\mathbb{Z}_N$, then for
many purposes it is enough just to know that $\gamma$ is a quadratic homomorphism, though in
these cases one can also give explicit descriptions and it is sometimes important to do so.
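The eight-point identity can be checked directly when $A = G = \mathbb{Z}_N$. The sketch below measures the largest deviation from 1 of the product in the definition, first for a quadratic phase and then for a cubic phase, which should fail. The modulus $N = 7$ and the two polynomials are illustrative choices, and $\gamma^{-1}$ is computed as the complex conjugate since $|\gamma| = 1$.

```python
import cmath
from itertools import product

def phase(poly, N):
    """gamma(x) = exp(2 pi i poly(x) / N) for x in Z_N."""
    return [cmath.exp(2j * cmath.pi * (poly(x) % N) / N) for x in range(N)]

def cube_identity_defect(gamma, N):
    """Largest value of |P(x,a,b,c) - 1|, where P is the eight-term product
    from the definition of a quadratic homomorphism."""
    def g(t):
        return gamma[t % N]
    worst = 0.0
    for x, a, b, c in product(range(N), repeat=4):
        value = (g(x)
                 * g(x + a).conjugate() * g(x + b).conjugate() * g(x + c).conjugate()
                 * g(x + a + b) * g(x + a + c) * g(x + b + c)
                 * g(x + a + b + c).conjugate())
        worst = max(worst, abs(value - 1))
    return worst

N = 7
quadratic = phase(lambda x: 3 * x * x + 2 * x + 5, N)
cubic = phase(lambda x: x ** 3, N)

defect_quadratic = cube_identity_defect(quadratic, N)
defect_cubic = cube_identity_defect(cubic, N)
```

The exponent of the product is, up to sign, the third difference of the polynomial, which vanishes for quadratics and equals $6abc$ for $x^3$; this is why the first defect is zero up to rounding and the second is not.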

The basic idea of quadratic Fourier analysis is that it is possible to decompose a func- 
tion into a linear combination of a small number of quadratic phase functions defined on 
regular Bohr sets, plus an error that does not affect calculations. One can of course do the 
same with conventional Fourier analysis simply by taking only the characters with large 
coefficients: however, there are circumstances where the error does affect calculations in 
the linear case, but does not in the quadratic case. 

A notable difference between linear and quadratic Fourier analysis is that there is not
a unique way of decomposing a function into quadratic parts, for the simple reason that
there are too many quadratic phase functions. Furthermore, there is not even a natural
notion of the "best" decomposition. So instead one has to settle for decompositions that
are somewhat arbitrary and try to control their properties. In order to get started, one
needs an inverse theorem, which in our case is a statement to the effect that if $\|f\|_{U^3}$ is
not small (which is a way of saying that $f$ is not already a "small error") then $f$ correlates
with a quadratic phase function.

The following theorem to this effect was proved by Green and Tao [GrT08].

Theorem 3.1. Let $f : \mathbb{Z}_N \to \mathbb{C}$ be a function such that $\|f\|_\infty \leq 1$ and $\|f\|_{U^3} \geq \delta$, and
let $C = 2^{24}$. Then there exists a regular Bohr set $B = B(K,\rho)$ with $|K| \leq (2/\delta)^C$ and
$\rho \geq (\delta/2)^C$ such that $\mathbb{E}_y \|f\|_{u^3(B+y)} \geq (\delta/2)^C$.

Here, $\|f\|_{u^3(B+y)}$ is defined to be the maximum correlation between $f$ and any quadratic
phase function $\gamma$ defined on $B + y$. More precisely, it is the maximum over all quadratic
phase functions $\gamma$ from $B + y$ to $\mathbb{T}$ of the quantity $|\mathbb{E}_{x \in B+y} f(x)\gamma(x)^{-1}|$.

In their paper, Green and Tao remark that a slightly more precise theorem holds. The
result as stated tells us that for each $y$ we can find a quadratic phase function $\omega^{q_y}$ defined
on $B + y$ such that the average of $|\mathbb{E}_{x \in B+y} f(x)\omega^{-q_y(x)}|$ is at least $(\delta/2)^C$. However, it is
actually possible to do this in such a way that the "quadratic parts" of the quadratic
phase functions $q_y$ are the same. That is, it can be done in such a way that each $q_y(x)$ has
the form $q(x - y) + \phi_y(x - y)$ for some (additive) quadratic homomorphism $q : B \to \mathbb{Z}_N$
(that is independent of $y$) and some Freiman homomorphism $\phi_y : B \to \mathbb{Z}_N$.
This will be convenient to us later, so we make the following definition, which is a
modification of a definition given in [GW09c] for the $\mathbb{F}_p^n$ case.

Definition. Let $B$ be a regular Bohr set and let $q$ be a quadratic map from $B$ to $\mathbb{Z}_N$. A
quadratic average with base $(B, q)$ is a function of the form $Q(x) = \mathbb{E}_{y \in x - B}\, \omega^{q_y(x)}$, where
each function $q_y$ is a quadratic map from $B + y$ to $\mathbb{Z}_N$ defined by a formula of the form
$q_y(x) = q(x - y) + \phi_y(x - y)$ for some Freiman homomorphism $\phi_y : B \to \mathbb{Z}_N$.

An equivalent way of defining $Q$, which may be clearer, is to start by defining for each
$y \in \mathbb{Z}_N$ the function $\gamma_y$, which takes the value $\omega^{q_y(x)}$ when $x \in B + y$ and 0 otherwise.
Then $Q$ is $|B|^{-1} \sum_y \gamma_y$. Thus, the value of $Q$ at $x$ is the average value of all the $\gamma_y(x)$ such
that $x$ belongs to the support of $\gamma_y$.
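A quadratic average can be computed directly from the definition. In the sketch below the quadratic map `q` and the maps `phi` are restrictions to $B$ of globally defined polynomials on $\mathbb{Z}_N$, a convenient special case that we assume for illustration (the definition allows more general maps defined only on $B$). The Bohr set is not checked for regularity, and $\omega = e^{2\pi i/N}$ is assumed.

```python
import cmath

def bohr_set(K, rho, N):
    """B(K, rho) with omega = exp(2 pi i / N) (assumed normalisation)."""
    omega = cmath.exp(2j * cmath.pi / N)
    return {x for x in range(N)
            if all(abs(omega ** ((r * x) % N) - 1) <= rho for r in K)}

def quadratic_average(B, q, phi, N):
    """Q(x) = E_{y in x - B} omega^{q_y(x)}, with q_y(x) = q(x - y) + phi(y, x - y)."""
    omega = cmath.exp(2j * cmath.pi / N)
    Q = []
    for x in range(N):
        total = 0j
        for t in B:                 # t = x - y runs over B as y runs over x - B
            y = (x - t) % N
            total += omega ** ((q(t) + phi(y, t)) % N)
        Q.append(total / len(B))
    return Q

N = 101
B = bohr_set({1, 2}, 1.0, N)                     # illustrative Bohr set

def q(t):
    return 2 * t * t                             # hypothetical quadratic part

def phi(y, t):
    return 3 * y * t                             # hypothetical linear part, depends on y

def zero(y, t):
    return 0

Q = quadratic_average(B, q, phi, N)
Q_plain = quadratic_average(B, q, zero, N)
```

Since each term has modulus 1, every value of `Q` has modulus at most 1. With all linear parts equal to zero the average does not depend on $x$, so `Q_plain` is constant.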

We can use the extra observation of Green and Tao to give a slightly more precise version 
of the inverse theorem. 

Theorem 3.2. Let $f : \mathbb{Z}_N \to \mathbb{C}$ be a function such that $\|f\|_\infty \leq 1$ and $\|f\|_{U^3} \geq \delta$,
and let $C_0 = 2^{24}$. Then there exists a regular Bohr set $B(K,\rho)$ with $|K| \leq (2/\delta)^{C_0}$ and
$\rho \geq (\delta/2)^{C_0}$, and a quadratic map $q : B \to \mathbb{Z}_N$, such that $|\langle f, Q \rangle| \geq (\delta/2)^{C_0}/2$ for some
quadratic average $Q$ with base $(B, q)$.

Proof. The results of Green and Tao tell us that we can find a regular Bohr set $B = B(K,\rho)$,
satisfying the above bounds, and a quadratic function $q$, and for each $y$ we can find a
Freiman homomorphism $\phi_y : B \to \mathbb{Z}_N$, such that, defining $q_y(x) = q(x - y) + \phi_y(x - y)$ on
$B + y$, we have

$\mathbb{E}_y |\mathbb{E}_{x \in B+y} f(x)\omega^{-q_y(x)}| \geq (\delta/2)^{C_0}.$

For each function $q_y$ we can add a constant $\lambda_y$ without affecting the left-hand side. If
$N \geq 3$, as we are certainly assuming, then we can choose this constant so that

$\Re\big(\mathbb{E}_{x \in B+y} f(x)\omega^{-(q_y(x) + \lambda_y)}\big) \geq \tfrac{1}{2} |\mathbb{E}_{x \in B+y} f(x)\omega^{-q_y(x)}|.$

Therefore, after suitably redefining the functions $q_y$ and setting $Q(x) = \mathbb{E}_{y \in x - B}\, \omega^{q_y(x)}$, we
have

$|\langle f, Q \rangle| \geq \Re\big(\mathbb{E}_x \mathbb{E}_{y \in x - B} f(x)\omega^{-q_y(x)}\big) \geq \tfrac{1}{2} \mathbb{E}_y |\mathbb{E}_{x \in B+y} f(x)\omega^{-q_y(x)}|,$

which proves the theorem. □

In the proof of the $\mathbb{F}_p^n$ case, we defined quadratic averages in a similar way, but the
role of Bohr sets was played by subgroups (or subspaces). This was simpler for several
reasons. One reason was that translates of a subspace partition $\mathbb{F}_p^n$, but this turns out not
to be a significant complication of the $\mathbb{Z}_N$ case. More problematic is that we made some
use of the fact that subspaces of $\mathbb{F}_p^n$ have a codimension, and it is not obvious what one
would mean by the "codimension" of a Bohr set. To answer this question, we focus on
the two main properties of codimension that we used for the $\mathbb{F}_p^n$ case: that a subspace of
codimension $d$ has density $p^{-d}$ and that the intersection of subspaces of codimension $d$ and
$d'$ has codimension at most $d + d'$. The analogous facts about Bohr neighbourhoods are
that the intersection of the neighbourhoods $B(K,\rho)$ and $B(L,\rho)$ equals the neighbourhood
$B(K \cup L,\rho)$, and that the density of $B(K,\rho)$ is at least $\rho^{|K|}$. Thus, for fixed $\rho$ the cardinality
of $K$ is a good analogue of the codimension.
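The intersection property is immediate from the definition and can be confirmed in a few lines. The sketch below assumes $\omega = e^{2\pi i/N}$ and uses arbitrary frequency sets; it checks only the identity $B(K,\rho) \cap B(L,\rho) = B(K \cup L,\rho)$ and makes no claim about the density bound.

```python
import cmath

def bohr_set(K, rho, N):
    """B(K, rho) with omega = exp(2 pi i / N) (assumed normalisation)."""
    omega = cmath.exp(2j * cmath.pi / N)
    return {x for x in range(N)
            if all(abs(omega ** ((r * x) % N) - 1) <= rho for r in K)}

N = 101
K, L = {1, 7}, {7, 30}   # illustrative frequency sets
rho = 1.2

intersection = bohr_set(K, rho, N) & bohr_set(L, rho, N)
union_frequencies = bohr_set(K | L, rho, N)
```

A point lies in both sets exactly when it satisfies the inequality for every frequency in $K$ and every frequency in $L$, which is the defining condition for $K \cup L$.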

At first, this seems odd, since the cardinality of $K$ is closely connected with the
dimension of $B$. However, it can also be seen as the number of inequalities that a point in $B$
must satisfy, and these inequalities are analogous to the linear constraints that a point in
a subspace must satisfy. Nevertheless, to avoid confusion we will not use the word
"codimension" here. Instead, we shall define the complexity of the Bohr set $B(K,\rho)$ to be the
pair $(|K|,\rho)$. Strictly speaking, this is not well-defined, since different pairs $(K,\rho)$ can
define the same Bohr set. So a slightly stricter definition is as follows: the Bohr set $B$ has
complexity at most $(d,\rho)$ if there exists a set $K$ of cardinality at most $d$ and a constant
$\rho' \geq \rho$ such that $B = B(K,\rho')$. (We say "at most" because we regard a smaller $\rho$ as giving
a higher complexity.) We say that a quadratic average $Q$ with base $(B, q)$ has complexity
at most $(d,\rho)$ if $B$ has complexity at most $(d,\rho)$.

Now, as we did in the $\mathbb{F}_p^n$ case, we can use fairly abstract reasoning to deduce some
decomposition results from the inverse theorem. First, we recall a result from [GW09b].
It is a straightforward consequence of the Hahn-Banach theorem and appears in [GW09b],
with proof, as Corollary 2.4. It can be thought of as a general machine for converting
inverse theorems into decomposition theorems.

Proposition 3.3. Let $k$ be a positive integer and for each $i \leq k$ let $\|\cdot\|_i$ be a norm defined
on a subspace $V_i$ of $\mathbb{C}^n$. Suppose also that $V_1 + \cdots + V_k = \mathbb{C}^n$. Let $\alpha_1, \dots, \alpha_k$ be positive
real numbers, and suppose that it is not possible to write the function $f$ as a linear sum
$f_1 + \cdots + f_k$ in such a way that $f_i \in V_i$ for each $i$ and $\alpha_1\|f_1\|_1 + \cdots + \alpha_k\|f_k\|_k \leq 1$. Then
there exists a function $\phi \in \mathbb{C}^n$ such that $|\langle f, \phi \rangle| \geq 1$ and such that $\|\phi\|_i^* \leq \alpha_i$ for every $i$.

The final condition on $\phi$ means that $|\langle g, \phi \rangle| \leq \alpha_i$ for every $i$ and every $g \in V_i$ with
$\|g\|_i \leq 1$.
We now apply Proposition 3.3 to obtain a theorem that tells us that an arbitrary function
$f$ that is bounded in $L_2$ can be decomposed as a linear combination of quadratic averages
plus a small error.

Theorem 3.4. Let $f : \mathbb{Z}_N \to \mathbb{C}$ be a function such that $\|f\|_2 \leq 1$. Let $C_0 = 2^{24}$. Then for
every $\delta > 0$ and $\eta > 0$ there exist $C$, $d$ and $\rho$ such that $f$ has a decomposition of the form

$f(x) = \sum_i \lambda_i Q_i(x) + g(x) + h(x),$

where the functions $Q_i$ are quadratic averages of complexity at most $(d,\rho)$, and

$\eta^{-1}\|g\|_1 + \delta^{-1}\|h\|_{U^3} + C^{-1} \sum_i |\lambda_i| \leq 1.$

Moreover, we can take $C = 4(2/\eta\delta)^{C_0}$, $d = (2/\delta)^{C_0}$ and $\rho = (\delta/2)^{C_0}$.

Proof. For every quadratic average $Q$ on $\mathbb{Z}_N$ of complexity at most $(d,\rho)$, let $V(Q)$ be the
one-dimensional subspace of $\mathbb{C}^{\mathbb{Z}_N}$ generated by $Q$, with the norm of $\lambda Q$ set to be $|\lambda|$. Let
$\alpha(Q)$ be $C^{-1}$ for every $Q$. In addition, let us take the $L_1$ norm and $U^3$ norm defined on
all of $\mathbb{C}^{\mathbb{Z}_N}$ and associate with them the constants $\eta^{-1}$ and $\delta^{-1}$, respectively.

Suppose that $f$ cannot be decomposed in the desired way. Applying Proposition 3.3 to
the norms, subspaces and positive constants defined above, we obtain a function $\phi : \mathbb{Z}_N \to
\mathbb{C}$ such that $\langle f, \phi \rangle \geq 1$, $\|\phi\|_\infty \leq \eta^{-1}$, $\|\phi\|_{U^3}^* \leq \delta^{-1}$ and $|\langle \phi, Q \rangle| \leq C^{-1}$ for every quadratic
average $Q$ of complexity at most $(d,\rho)$.

Because $\|f\|_2 \leq 1$ and $\langle f, \phi \rangle \geq 1$, we find that $\langle \phi, \phi \rangle = \|\phi\|_2^2 \geq 1$. But then $\|\phi\|_{U^3}\|\phi\|_{U^3}^* \geq
1$, which implies that $\|\phi\|_{U^3} \geq \delta$. Applying Theorem 3.2 to $\eta\phi$, we obtain a quadratic
average $Q$ of complexity at most $(d,\rho)$ such that $|\langle \phi, Q \rangle| \geq (\eta\delta/2)^{C_0}/2$, which is a contradiction
since this inner product was supposed to be at most $C^{-1}$ for all such quadratic averages. □

4. Generalized quadratic averages 

Before we go any further, we must address a technical issue that did not arise for $\mathbb{F}_p^n$.
There, it is a triviality that if $Q$ is a quadratic average with base $(V, q)$ and $Q'$ is a quadratic
average with base $(V', q')$, then $Q\overline{Q'}$ is a quadratic average with base $(V \cap V', q - q')$. The
analogous statement for $\mathbb{Z}_N$ is false, but an approximate version of it is true if we are
prepared to generalize the notion of a quadratic average.

In fact, we shall begin by discussing an even more basic statement, which again does not
quite hold for $\mathbb{Z}_N$, namely the statement that if $Q$ is a quadratic average with base $(V, q)$
and $V'$ is a subspace of $V$, then $Q$ is a quadratic average with base $(V', q)$.
In order to obtain an analogue of this statement for $\mathbb{Z}_N$, we first define a generalized
quadratic average with base $(B, q)$ to be any average of quadratic averages with base
$(B, q)$; that is, any function of the form $n^{-1}(Q_1 + \cdots + Q_n)$, where $n$ is a positive integer
and each $Q_i$ is a quadratic average with base $(B, q)$.

Lemma 4.1. Let $\epsilon > 0$, let the Bohr sets $B$ and $B'$ satisfy $B' \prec_\epsilon B$. Let $q$ be a quadratic
form on $B$ and let $Q$ be a generalized quadratic average with base $(B, q)$. Then there is a
generalized quadratic average $Q'$ with base $(B', q)$ such that $\|Q - Q'\|_\infty \leq 4\epsilon$.

Proof. We begin by proving the result when $Q$ is a quadratic average, defined by the
formula $Q(x) = \mathbb{E}_{y \in x - B}\, \omega^{q_y(x)}$. Let $A$ be the interior associated with the pair $B' \prec_\epsilon B$.
From the proof of Lemma 2.3 (ii), we know that

$\mathbb{E}_{y \in x - B}\, \omega^{q_y(x)} \approx_\epsilon \mathbb{E}_{y \in x - A}\, \omega^{q_y(x)},$

so if we set $R(x)$ to equal $\mathbb{E}_{y \in x - A}\, \omega^{q_y(x)}$ then $\|R - Q\|_\infty \leq \epsilon$.

Next, we define $S(x)$ to be $\mathbb{E}_{y \in x - A} \mathbb{E}_{u \in -B'}\, \omega^{q_{y+u}(x)}$. By Lemma 2.3 (iii), $\|S - R\|_\infty \leq 3\epsilon$.
But $S(x)$ is equal to $\mathbb{E}_{y \in -A} \mathbb{E}_{u \in x - B'}\, \omega^{q_{y+u}(x)}$. For each $y \in -A$ and $u \in B'$, $q_{y+u}$ is a
local quadratic that is defined everywhere on $B + y + u$, and hence on $B' + u$ (since
$B' - y \subset B' + A \subset B$). Therefore, since translating a local quadratic has the effect of
adding a Freiman homomorphism, for each $y \in -A$ the function $Q_y(x) = \mathbb{E}_{u \in x - B'}\, \omega^{q_{y+u}(x)}$
is a quadratic average with base $(B', q)$. It follows that $S$ is a generalized quadratic average
with base $(B', q)$. By the triangle inequality, $\|Q - S\|_\infty \leq 4\epsilon$.

The result for generalized quadratic averages follows easily: one simply applies the result
just proved to each individual quadratic average and uses the triangle inequality. □

Lemma 4.2. Let $\epsilon > 0$. Let $B_1$ and $B_2$ be regular Bohr sets, let $q_1$ and $q_2$ be quadratic
functions defined on them and let $Q_1$ and $Q_2$ be generalized quadratic averages with bases
$(B_1, q_1)$ and $(B_2, q_2)$, respectively. Suppose that the Bohr sets $B$ and $B'$ satisfy the relations
$B' \prec_\epsilon B \prec_\epsilon B_1 \cap B_2$. Then there exists a generalized quadratic average $Q'$ with base
$(B', q_1 - q_2)$ such that $\|Q_1\overline{Q_2} - Q'\|_\infty \leq 18\epsilon$.

Proof. The argument is similar to the proof of the previous lemma, but slightly more
complicated. First of all, by that lemma we can uniformly approximate both $Q_1$ and $Q_2$
to within $4\epsilon$ by generalized quadratic averages $Q_1'$ and $Q_2'$ with bases $(B, q_1)$ and $(B, q_2)$,
respectively. Suppose that $Q_1' = n_1^{-1}(Q_{1,1} + \dots + Q_{1,n_1})$ (and similarly for $Q_2'$). If we can find for each pair $(r, s)$
a generalized quadratic average $Q_{rs}$ such that $\|Q_{1,r}\overline{Q_{2,s}} - Q_{rs}\|_\infty \le 10\epsilon$, then by taking the
average over all $r$ and $s$ and applying the triangle inequality, we find that $\|Q_1'\overline{Q_2'} - Q'\|_\infty \le 10\epsilon$,
where $Q'$ is the average of the $Q_{rs}$. Thus, it is enough to prove the result when $Q_1'$
and $Q_2'$ are non-generalized quadratic averages.

Let us do this and let $Q_1'$ and $Q_2'$ be given by the formulae $Q_1'(x) = \mathbb{E}_{y \in x-B}\,\omega^{q_{1,y}(x)}$ and
$Q_2'(x) = \mathbb{E}_{y \in x-B}\,\omega^{q_{2,y}(x)}$, respectively.

Now let us imitate the previous proof. Let $B^-$ be the interior associated with the pair
$B' \prec_\epsilon B$. Then, as we did for $Q$ in the previous lemma, we can uniformly approximate
$Q_1'$ and $Q_2'$ to within $\epsilon$ by quadratic averages $R_1$ and $R_2$ that are given by the formulae
$R_1(x) = \mathbb{E}_{y \in x-B^-}\,\omega^{q_{1,y}(x)}$ and $R_2(x) = \mathbb{E}_{y \in x-B^-}\,\omega^{q_{2,y}(x)}$, respectively.

Let us examine the product $R_1(x)\overline{R_2(x)}$. It equals $\mathbb{E}_{y,z \in x-B^-}\,\omega^{q_{1,y}(x) - q_{2,z}(x)}$, which, by
Lemma 2.3 (v), differs by at most $8\epsilon$ from

But the right-hand side is the formula for a generalized quadratic average with base $(B', q_1 - q_2)$,
so we are done. □



5. The rank of a quadratic average 

Recall that so far we have shown how to decompose a function defined on $\mathbb{Z}_N$ into a
linear combination of quadratic averages and an error that is small in a useful sense. As
we did in [GW09b] in the proof for $\mathbb{F}_p^n$, we shall now collect these quadratic averages into
well-correlating clusters. However, before we do so, we must think about another concept
that is very convenient when discussing quadratic forms on $\mathbb{F}_p^n$ and that does not have an
immediately obvious $\mathbb{Z}_N$ analogue, namely the rank of a form.

Suppose that we have a quadratic form $q$ defined on a subspace $V$ of $\mathbb{F}_p^n$. Then a
simple calculation shows that $|\mathbb{E}_x\,\omega^{q(x)}| = p^{-r/2}$, where $r$ is the rank of the bilinear form
$\beta(u, v) = (q(u + v) - q(u) - q(v))/2$ associated with $q$. Indeed,

$$|\mathbb{E}_x\,\omega^{q(x)}|^2 = \mathbb{E}_{x,y}\,\omega^{q(x) - q(y)} = \mathbb{E}_{u,v}\,\omega^{q(u+v) - q(v)} = \mathbb{E}_u\,\omega^{q(u)}\,\mathbb{E}_v\,\omega^{2\beta(u,v)}.$$

For each $u$, the expectation over $v$ is $0$ unless $\beta(u, v) = 0$ for every $v$, in which case it is $1$.
But the set of $u$ such that $\beta(u, v)$ vanishes for every $v$ is a subspace of $V$ of codimension $r$, so it has
density $p^{-r}$, which proves the result.
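
The identity $|\mathbb{E}_x\,\omega^{q(x)}| = p^{-r/2}$ is easy to test numerically in small cases. The following is a minimal sketch and not part of the argument in [GW09b]: it assumes that $p$ is an odd prime and restricts to diagonal forms $q(x) = \sum_i c_i x_i^2$ on $\mathbb{F}_p^n$, for which the rank $r$ is the number of nonzero coefficients $c_i$. The function name `quadratic_phase_average` is ours.

```python
import cmath
import itertools


def quadratic_phase_average(p, coeffs):
    """Return |E_x omega^{q(x)}| over F_p^n for q(x) = sum_i coeffs[i] * x_i^2.

    Assumes p is an odd prime; omega = exp(2*pi*i/p).
    """
    n = len(coeffs)
    # Precompute the p-th roots of unity so that exponents are reduced mod p.
    roots = [cmath.exp(2j * cmath.pi * k / p) for k in range(p)]
    total = 0
    for x in itertools.product(range(p), repeat=n):
        q_value = sum(c * xi * xi for c, xi in zip(coeffs, x)) % p
        total += roots[q_value]
    return abs(total) / p ** n


# A diagonal form with two nonzero coefficients has rank 2, so the
# average should have modulus p^{-2/2} = 1/p.
print(quadratic_phase_average(5, (1, 2, 0)), 5 ** -1)
```

For instance, with $p = 5$ and coefficients $(1, 2, 0)$ the rank is 2 and the two printed numbers should agree up to rounding error.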

This calculation allowed us to argue as follows in [GW09b]. Given a quadratic form $q$,
we looked at its rank $r$. If $r$ was large, then $\omega^q$ had a small average, whereas if $r$ was small
then $q$ was constant on translates of a subspace of low codimension. This dichotomy played
an important role in the proof, so we need to find an analogue for $\mathbb{Z}_N$.

A close examination of the proof in [GW09b] shows that the main properties of rank
that we used were that the rank of a sum of two quadratic forms is at most the sum of
the ranks, and that if $\beta$ is a bilinear form of rank $r$ and $\phi$ and $\psi$ are linear functions, then
$\mathbb{E}_{x,y}\,\omega^{\beta(x,y) + \phi(x) + \psi(y)}$ has size at most $p^{-r}$.

The first of these properties looks very much like a fact of linear algebra, so it is tempting
to try to develop an analogue of this linear algebra for Bohr sets. Unfortunately, although
some analogues of linear algebra do exist, they are much less clean, and in any case they
are completely inappropriate for our purposes, roughly speaking because the codimension
of a subspace of $\mathbb{F}_p^n$ corresponds more to the dimension of a Bohr set in $\mathbb{Z}_N$. For example,
if we are guided by linear algebra then we will be inclined to say that the function $\omega^{x^2}$ has
rank 1, but in fact we want to count it as having very high rank because the expectation
$\mathbb{E}_{x,y}\,\omega^{2xy}$ is tiny.
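
To see concretely how small this expectation is, one can compute it directly. The sketch below is only an illustration under our own assumptions: $N$ is taken to be odd, $\omega = e^{2\pi i/N}$, and the function name `bilinear_phase_average` is ours. For odd $N$ the inner sum over $y$ vanishes unless $x = 0$, so the expectation equals $1/N$.

```python
import cmath


def bilinear_phase_average(N):
    """Return |E_{x,y} omega^{2xy}| over Z_N, where omega = exp(2*pi*i/N)."""
    # Table of N-th roots of unity, indexed by the exponent mod N.
    roots = [cmath.exp(2j * cmath.pi * k / N) for k in range(N)]
    total = sum(roots[(2 * x * y) % N] for x in range(N) for y in range(N))
    return abs(total) / N ** 2


# For odd N only x = 0 contributes, so the value should be 1/N.
print(bilinear_phase_average(101), 1 / 101)
```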

There is, however, a rather easy way to define an appropriate notion of rank in the
$\mathbb{Z}_N$ context, which is to exploit the fact that there is a completely different alternative
definition in $\mathbb{F}_p^n$. We observed above that $|\mathbb{E}_x\,\omega^{q(x)}|$ is not just at most $p^{-r/2}$ but actually
equal to $p^{-r/2}$. Therefore, we could, if we wanted, define the rank of $q$ to be $\log_p(\alpha^{-1})$,
where $\alpha = |\mathbb{E}_x\,\omega^{q(x)}|^2$. And this gives us a definition that can be carried over much more
easily to functions defined on Bohr sets in $\mathbb{Z}_N$. (It can also be carried over much more
easily to polynomial forms of higher degree and their associated multilinear forms. This
was essential to us in [GW09c].)

We do of course pay a price for such a move. If we define rank in this way then it 
becomes true by definition that averages over quadratic phase functions of high rank are 
small. But we clearly cannot avoid doing any work: it is now not obvious that rank is 
subadditive or that a quadratic function of low rank has linear structure. In fact, neither 
of these statements is exactly true, but with some effort we will be able to prove usable 
approximations to them. 

One final remark is that it turns out to be more convenient to focus on bilinear forms
rather than quadratic forms. On a subspace of $\mathbb{F}_p^n$ the two are basically equivalent, but on
a Bohr set $B$ in $\mathbb{Z}_N$ they no longer are, because $q(a + b) - q(a) - q(b)$ is not defined for
every $a, b \in B$. We therefore have to look at a smaller structured set inside $B$.

Here, then, is the definition that we shall use. Some of the features of the definition may 
look a bit strange, but they are chosen to make later proofs run more smoothly. 

Definition. Let $B$ be a Bohr set and let $q$ be a quadratic form on $B$. Let $B'$ be a Bohr
set such that $2B' - 2B' \subset B$ and let $P$ be a subset of $B'$. The rank of the local quadratic
phase function $h(x) = B(x)\,\omega^{q(x)}$ relative to $P$ is $\log(1/\alpha)$, where $\alpha$ is the quantity

$$\alpha = \left|\mathbb{E}_{a,a',b,b' \in P}\,\omega^{q(a+b-a'-b') - q(a-a') - q(b-b')}\right|.$$

If $Q$ is a generalized quadratic average with base $(B, q)$, then we define the rank of $Q$
relative to $P$ to be the rank of the local quadratic phase function $B(x)\,\omega^{q(x)}$ relative to $P$.

Note that if $q$ and $q'$ are two different quadratic functions defined on $B$, and if $q - q'$ is
a Freiman homomorphism, then any quadratic average with base $(B, q)$ is also a quadratic
average with base $(B, q')$. Therefore, one must check that the second definition above is
well-defined. But this is easy, since if $q - q'$ is the Freiman homomorphism $\gamma$, then

$$q(a+b-a'-b') - q(a-a') - q(b-b') = q'(a+b-a'-b') - q'(a-a') - q'(b-b') - \gamma(0).$$

It follows that the expectation that defines the rank is unchanged in modulus.

The next lemma tells us that the expectation of a generalized quadratic average with 
high rank is small. 

Lemma 5.1. Let $0 < \epsilon < 1/20$, let $B$ and $B'$ be Bohr sets satisfying $B' \prec_\epsilon B$ and let
$P \subset B'$. Let $q$ be a quadratic form on $B$ and let $Q$ be a generalized quadratic average with
base $(B, q)$. Suppose that the rank of $Q$ relative to $P$ is $r$. Then $|\mathbb{E}_x Q(x)| \le (11\epsilon + e^{-r})^{1/4}$.

Proof. Let $h$ be a local quadratic phase function defined by a formula of the form
$h(x) = B(x-y)\,\omega^{q(x-y) + \phi(x)}$ for some Freiman homomorphism $\phi$ (so in particular $h$ is supported in
the Bohr neighbourhood $B + y$). Let us estimate $|\mathbb{E}_{x \in B+y}\,h(x)|$, using Lemma 2.3 and the
Cauchy-Schwarz inequality. Since $\phi$ is an arbitrary Freiman homomorphism, it is enough
to do this when $y = 0$, so let us assume that that is the case. Recall that "$\approx_\epsilon$" stands for
the relation "differs by at most $\epsilon$ from".

By Lemma 2.3 (i) applied twice, we know that $\mathbb{E}_{x \in B}\,h(x) \approx_{2\epsilon} \mathbb{E}_{x \in B}\,\mathbb{E}_{a,b \in P}\,h(x+a+b)$. It
is not hard to check that if $\epsilon < 1/20$ and $\alpha$ and $\beta$ are complex numbers such that $|\alpha| \le 1$
and $\alpha \approx_{2\epsilon} \beta$, then $\alpha^4 \approx_{10\epsilon} \beta^4$. Therefore, since $\|h\|_\infty \le 1$,

$$|\mathbb{E}_{x \in B}\,h(x)|^4 \approx_{10\epsilon} |\mathbb{E}_{x \in B}\,\mathbb{E}_{a,b \in P}\,h(x+a+b)|^4 \le \mathbb{E}_{x \in B}\,|\mathbb{E}_{a,b \in P}\,h(x+a+b)|^4.$$

Now let us look at the inner expectation when $x \in B^-$. We have

$$\begin{aligned}
|\mathbb{E}_{a,b \in P}\,h(x+a+b)|^4 &\le \left(\mathbb{E}_{a \in P}\,|\mathbb{E}_{b \in P}\,h(x+a+b)|^2\right)^2 \\
&= \left(\mathbb{E}_{b,b' \in P}\,\mathbb{E}_{a \in P}\,h(x+a+b)\overline{h(x+a+b')}\right)^2 \\
&\le \mathbb{E}_{b,b' \in P}\,\left|\mathbb{E}_{a \in P}\,h(x+a+b)\overline{h(x+a+b')}\right|^2 \\
&= \mathbb{E}_{a,a' \in P}\,\mathbb{E}_{b,b' \in P}\,h(x+a+b)\overline{h(x+a+b')}\,\overline{h(x+a'+b)}\,h(x+a'+b'),
\end{aligned}$$

which equals $e^{-r}$ by definition of the rank relative to $P$. The proportion of $x$ that belong
to $B \setminus B^-$ is at most $\epsilon$, by regularity, and for these the inner expectation is at most 1.
This proves that $|\mathbb{E}_{x \in B}\,h(x)|^4 \le 11\epsilon + e^{-r}$, and hence that $|\mathbb{E}_{x \in B}\,h(x)| \le (11\epsilon + e^{-r})^{1/4}$.
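
The elementary claim used in this estimate, that $\alpha \approx_{2\epsilon} \beta$ implies $\alpha^4 \approx_{10\epsilon} \beta^4$, is left to the reader in the text. One way to check it, given here only as a sketch under the stated assumptions $|\alpha| \le 1$, $|\alpha - \beta| \le 2\epsilon$ and $0 < \epsilon < 1/20$, is the following.

```latex
% Assumptions: |\alpha| \le 1, |\alpha - \beta| \le 2\epsilon, 0 < \epsilon < 1/20.
\begin{align*}
|\beta| &\le |\alpha| + 2\epsilon \le 1 + \tfrac{1}{10}, \\
|\alpha^4 - \beta^4|
  &= |\alpha - \beta|\,\bigl|\alpha^3 + \alpha^2\beta + \alpha\beta^2 + \beta^3\bigr| \\
  &\le 2\epsilon\,\bigl(1 + 1.1 + 1.1^2 + 1.1^3\bigr)
   = 2\epsilon \cdot 4.641 \le 10\epsilon .
\end{align*}
```

Since $\bigl||\alpha| - |\beta|\bigr| \le |\alpha - \beta|$, the same computation applies to the moduli, which is the form in which the claim is used.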

Now let $Q$ be a quadratic average given by a formula of the kind $Q(x) = \mathbb{E}_{y \in x-B}\,\omega^{q_y(x)}$.
Then $\mathbb{E}_x Q(x) = \mathbb{E}_y\,\mathbb{E}_{x \in B+y}\,\omega^{q_y(x)}$. By the estimate just established, this has absolute value
at most $(11\epsilon + e^{-r})^{1/4}$. Finally, this implies the same upper bound when $Q$ is a generalized
quadratic average. □



We remark that for the lemma just proved to be useful, one needs $\epsilon$ to be comparable
to or smaller than $e^{-r}$. This may seem to be quite a strong requirement, given that we
also need $B' \prec_\epsilon B$. However, a recurring theme in this paper is that one can afford to take
Bohr sets of small width: it is the dimension that one has to be careful about. So in fact
the bound above is not too expensive for our later arguments to work.

We shall now prove a more general result. The proof we give is in two senses not optimal.
The first is that we obtain a bound that is weaker than it needs to be, because we estimate
an $\ell_4$ norm in terms of an $\ell_\infty$ norm. The second, more serious, is that we use Fourier
analysis. The reason this is a defect is that it obscures the fact that the proof can be
carried out in physical space and is therefore not hard to generalize. However, since in this
paper we shall not be dealing with the cubic case for functions defined on $\mathbb{Z}_N$, this is not
enough of a defect to outweigh the advantage that the proof we give is very simple and
does not involve technicalities concerning regular Bohr sets.



Lemma 5.2. Let $B$ and $B'$ be Bohr sets satisfying $B' \prec_\epsilon B$ and let $P$ be a subset of $B'$.
Let $q$ be a quadratic form on $B$ and let $Q$ be a generalized quadratic average with base
$(B, q)$. Suppose that the rank of $Q$ relative to $P$ is $r$. Then $\|Q\|_{U^2} \le (11\epsilon + e^{-r})^{1/8}$.

Proof. For every $u$ the function $Q(x)\,\omega^{ux}$ satisfies all the hypotheses of Lemma 5.1. Therefore,
by that lemma, $|\hat{Q}(u)| \le (11\epsilon + e^{-r})^{1/4}$ for every $u$. Since $\|\hat{Q}\|_2^2 = \|Q\|_2^2 \le 1$, it follows
that $\|\hat{Q}\|_4^4 \le (11\epsilon + e^{-r})^{1/2}$ and hence that $\|Q\|_{U^2} \le (11\epsilon + e^{-r})^{1/8}$. □

The next result expresses the idea that if two quadratic averages $Q$ and $Q'$ correlate
well then they have a "low-rank difference" in the exponent. Very roughly speaking, this is
because $Q\overline{Q'}$ has large average, and is therefore a low-rank quadratic. Of course, this is not
quite the correct argument, because $Q$ and $Q'$ are averages of quadratic phase functions
defined on several different Bohr neighbourhoods. However, the basic idea is sound, as the
next result shows.

Corollary 5.3. Let $B$ and $B'$ be two arbitrary Bohr sets, let $q$ and $q'$ be quadratic forms on
$B$ and $B'$, and let $Q$ and $Q'$ be generalized quadratic averages with bases $(B, q)$ and $(B', q')$.
Let $B_1$, $B_2$ and $B_3$ be Bohr sets satisfying the chain of relations $B_3 \prec_\epsilon B_2 \prec_\epsilon B_1 \prec_\epsilon B \cap B'$.
Let $P$ be a subset of $B_3$. Suppose that the rank of the function $B_2(x)\,\omega^{q(x) - q'(x)}$ relative to
$P$ is at least $r$. Then $|\langle Q, Q'\rangle| \le 18\epsilon + (11\epsilon + e^{-r})^{1/4}$.

Proof. By Lemma 4.2 there is a generalized quadratic average $Q''$ with base $(B_2, q - q')$ such
that $\|Q\overline{Q'} - Q''\|_\infty \le 18\epsilon$. By Lemma 5.1 and our hypothesis, $|\mathbb{E}_x Q''(x)| \le (11\epsilon + e^{-r})^{1/4}$.
Since $\langle Q, Q'\rangle = \mathbb{E}_x (Q\overline{Q'})(x)$, the result follows. □

6. The structure of low-rank bilinear forms on Bohr sets 

Our next task is to understand the implications if the hypotheses of Corollary 5.3 do
not hold. In the $\mathbb{F}_p^n$ case, we argued that if $Q$ and $Q'$ are quadratic averages and $\langle Q, Q'\rangle$
is not small, then $Q\overline{Q'}$ has low rank, from which it follows that $Q\overline{Q'}$ is constant on cosets
of a low-codimensional subspace. From this we deduced that $\|Q\overline{Q'}\|_{U^2}^*$ is not too large.
In this paper, where $Q$ and $Q'$ are defined on a Bohr set $B$, we shall argue that $Q\overline{Q'}$
is approximately constant on translates of a small (but not too small) multidimensional
arithmetic progression $P \subset B$, and deduce that $Q\overline{Q'}$ can be uniformly approximated by a
function with smallish $(U^2)^*$ norm.

A similar result to this was proved by Green and Tao in [GrT08] using a "local Bogolyubov
lemma" that they developed specially for the purpose. Their argument can be
used to show that $Q\overline{Q'}$ is approximately constant on a Bohr subset $B'$ of $B$. However, the
local Bogolyubov lemma is rather expensive, in that the dimension of $B'$ is considerably
larger than that of $B$. This expense has to be iterated, and it turns out that if we were
to use their result, then we would end up with a tower-type bound for our final estimate.
By contrast, the progression $P$ that we find has the same dimension as that of $B$ and
the final estimate we obtain is doubly exponential. Unfortunately, our argument is rather
uglier than that of Green and Tao since we rely on the fact that bilinear forms on multidimensional
arithmetic progressions can be explicitly described, rather than just using the
defining properties of bilinear forms on Bohr sets.

Although we eventually need a statement about Bohr sets, we shall begin by proving 
a dichotomy for bilinear phase functions defined on multidimensional progressions. As a 
prelude, here is a proof for the special case of one-dimensional progressions. It will be 
useful for the general case if we prove a non-symmetric result where one variable belongs 
to one progression and the other to another of a possibly different length. We shall also 
allow our bilinear forms to be non-homogeneous. That is, we shall consider functions of
the form $b(x, y) = e(\alpha xy + \lambda x + \mu y + \nu)$ with $1 \le x \le m_1$ and $1 \le y \le m_2$. The aim will
be to prove that such a function either has a small average (where this means smaller than
a small positive constant $c$) or is approximately constant on a reasonably large subgrid of
$[m_1] \times [m_2]$ (where this means a subgrid of size at least $c' m_1 m_2$ for some not too small
positive constant $c'$, but $c'$ is allowed to be smaller than $c$ and this elbow room will be
quite helpful). The precise statement is as follows. As is standard, if $\theta$ is a real number
then we write $\|\theta\|$ for the distance from $\theta$ to the nearest integer.

Lemma 6.1. Let $c > 0$ and let $m_1$ and $m_2$ be positive integers. Let $b(x, y)$ be the bilinear
phase function $e(\alpha xy + \lambda x + \mu y + \nu)$, defined when $0 \le x < m_1$ and $0 \le y < m_2$. Suppose
that $|\mathbb{E}_{x,y}\,b(x, y)| \ge 2c$. Then there exists a positive integer $q \le 2c^{-1}$ and an integer $p$
such that $|\alpha - p/q| \le 2c^{-2}/m_1 m_2$. In particular, $\|\alpha xy\| \le 2c$ whenever $x$ and $y$ are both
multiples of $q$ and $x/m_1$ and $y/m_2$ are both at most $c^{3/2}$.

Proof. Observe first that $|\mathbb{E}_{x,y}\,b(x, y)| \le \mathbb{E}_y\,|\mathbb{E}_x\,b(x, y)| = \mathbb{E}_y\,|\mathbb{E}_x\,e(\alpha xy + \lambda x + \mu y + \nu)|$. (Here,
as in the statement of the lemma, the expectations are over all $x$ and $y$ with $0 \le x < m_1$
and $0 \le y < m_2$.) Now for each $y$, the quantity $|\mathbb{E}_x\,e(\alpha yx + \lambda x + \mu y + \nu)|$ is at most
$\min\{1, 1/m_1\|\alpha y + \lambda\|\}$, by the formula for summing a geometric progression. In particular,
it is at most $c$ unless $\|\alpha y + \lambda\| \le C/m_1$, where $C = 1/c$. So the only way that $\mathbb{E}_y\,|\mathbb{E}_x\,b(x, y)|$
can be at least $2c$ is if $\|\alpha y + \lambda\| \le C/m_1$ for at least $c m_2$ values of $y$ (since otherwise we
get less than $c + c$).

So now let us think about what is implied if $\|\alpha y + \lambda\| \le C/m_1$ for at least $c m_2$ values of $y$.
We shall show that $\alpha$ is within $C'/m_1 m_2$ of a rational with small denominator, where with
the benefit of hindsight we choose $C'$ to equal $2C/c = 2c^{-2}$. It is an easy and standard
consequence of the pigeonhole principle that there is a rational $p/q$ with $q \le m_1 m_2/C'$
such that $|\alpha - p/q| \le C'/m_1 m_2 q$. It follows that $|\alpha y - py/q| \le C'/m_1 q$ for every $y \le m_2$.
Therefore, either $q \le C'/C$ or $\|\lambda + py/q\| \le 2C/m_1$ whenever $\|\alpha y + \lambda\| \le C/m_1$.

So now let us bound the number of multiples $py/q$ of $p/q$ such that $\lambda + py/q$ can be
within $2C/m_1$ of an integer, given that $p$ and $q$ are coprime. To do this we split into cases.
If $q \le m_1/4C$, then $1/q \ge 4C/m_1$, so two translates of multiples of $1/q$ that are distinct
mod 1 cannot both be within $2C/m_1$ of an integer. But since $p$ and $q$ are coprime, any $q$
distinct multiples of $p/q$ are also distinct multiples of $1/q$ mod 1, so at most one of them
is within $2C/m_1$ of an integer when you add $\lambda$ to it. So if $q \le m_1/4C$, then $\|\alpha y + \lambda\| \le C/m_1$
for at most $q^{-1} m_2 + 1$ values of $y$. (The "$+1$" is there because $m_2$ doesn't have to be a
multiple of $q$.) If this is at least $c m_2$, then $q$ is certainly at most $2c^{-1}$.

If $q > m_1/4C$ then we argue differently. This time we argue that the number of multiples
of $1/q$ that are distinct mod 1 and lie within $2C/m_1$ of an integer is at most $2Cq/m_1$.
Since $q \le m_1 m_2/C'$, this is at most $2C m_2/C' = c m_2$. Therefore, the number of $y$ such
that $\|\alpha y + \lambda\| \le C/m_1$ is also at most $c m_2$.

In conclusion, either $q \le 2c^{-1}$ or $\mathbb{E}_y\,|\mathbb{E}_x\,b(x, y)| = \mathbb{E}_y\,|\mathbb{E}_x\,e(\alpha xy + \lambda x + \mu y + \nu)| < 2c$.
In the first case, we have $|\alpha - p/q| \le C'/m_1 m_2 q = 2c^{-2}/m_1 m_2 q$, which implies the first
assertion. If $x$ and $y$ are multiples of $q$ then $\|\alpha xy\| \le 2c^{-2} xy/m_1 m_2$. Therefore, if in
addition $xy \le c^3 m_1 m_2$, then we have that $\|\alpha xy\| \le 2c$. In particular, $\|\alpha xy\| \le 2c$ if $x$ and
$y$ are both multiples of $q$ and $x \le c^{3/2} m_1$ and $y \le c^{3/2} m_2$. □
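
The dichotomy in Lemma 6.1 can be observed numerically. The following is a minimal sketch in the homogeneous case $\lambda = \mu = \nu = 0$; the function names and the two sample values of $\alpha$ are our own choices for illustration. When $\alpha = 1/3$ the average is large and $\alpha$ is (trivially) close to a rational with small denominator, whereas for a badly approximable $\alpha$ such as $(\sqrt{5}-1)/2$ the average is small.

```python
import cmath
import math


def dist_to_nearest_int(theta):
    """||theta||: the distance from theta to the nearest integer."""
    return abs(theta - round(theta))


def phase_average(alpha, m1, m2):
    """Return |E_{x,y} e(alpha*x*y)| over 0 <= x < m1, 0 <= y < m2.

    Here e(t) = exp(2*pi*i*t), and the linear and constant terms of the
    bilinear phase are taken to be zero.
    """
    total = sum(cmath.exp(2j * math.pi * alpha * x * y)
                for x in range(m1) for y in range(m2))
    return abs(total) / (m1 * m2)


# alpha = 1/3: only the x divisible by 3 contribute, so the average is 1/3.
print(phase_average(1 / 3, 60, 60))

# alpha = (sqrt(5) - 1)/2 is far from rationals with small denominator,
# and the average is correspondingly small.
golden = (math.sqrt(5) - 1) / 2
print(phase_average(golden, 200, 200))
```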

Let us now see how Lemma 6.1 generalizes to a similar statement for bilinear phase functions
on $d$-dimensional arithmetic progressions. This turns out to follow fairly straightforwardly
from the one-dimensional case.

So now $x = (x_1, \dots, x_d)$ and $y = (y_1, \dots, y_d)$ range over $d$-dimensional arithmetic progressions,
and we are looking at a function of the form $b(x, y) = e(\sum_{i,j} \alpha_{ij} x_i y_j)$. We would
like to show that either the average of $b(x, y)$ is small or every $\alpha_{ij}$ is extremely close to a
rational with small denominator. In the latter case, we will be able to restrict to a subprogression
of the same dimension that is not too much smaller on which $b$ is approximately
constant.

Corollary 6.2. Let $c > 0$, let $m_1, \dots, m_d$ be positive integers and let $P$ be the multidimensional
progression $\prod_{i=1}^d \{0, 1, \dots, m_i - 1\}$. Let $b$ be a bilinear phase function on
$P$ given by the formula $b(x, y) = e(\sum_{i,j} \alpha_{ij} x_i y_j + \sum_i \lambda_i x_i + \sum_j \mu_j y_j + \nu)$. Then either
$|\mathbb{E}_{x,y}\,b(x, y)| < 2c$ or there exist positive integers $q_{rs} \le 2c^{-1}$ and integers $p_{rs}$ such that
$|\alpha_{rs} - p_{rs}/q_{rs}| \le 2c^{-2}/m_r m_s$ for every $r, s \le d$. In the second case, there are positive
integers $q_1, \dots, q_d \le (2c^{-1})^{2d}$ such that $\|\sum_{r,s} \alpha_{rs} x_r y_s\| \le 2d^2 c$ whenever $x$ and $y$ belong to
the subprogression $P'$ that consists of all $z \in P$ such that each $z_r$ is of the form $h_r q_r$ for
some $h_r$ between $0$ and $c^{3/2} m_r$.

Proof. Let us fix all coordinates of $x$ and $y$ apart from $x_r$ and $y_s$ and estimate the quantity
$|\mathbb{E}_{x_r}\,\mathbb{E}_{y_s}\,b(x, y)|$. We can write this expression in the form $|\mathbb{E}_{x_r, y_s}\,e(\alpha_{rs} x_r y_s + \lambda x_r + \mu y_s + \nu)|$,
where $\lambda$, $\mu$ and $\nu$ depend on the other coordinates of $x$ and $y$. Therefore, by Lemma 6.1
either $|\mathbb{E}_{x_r, y_s}\,b(x, y)| < 2c$ or there exists a positive integer $q_{rs} \le 2c^{-1}$ and an integer $p_{rs}$
such that $|\alpha_{rs} - p_{rs}/q_{rs}| \le 2c^{-2}/m_r m_s$.

In the second case, $\|\alpha_{rs} x_r y_s\| \le 2c$ whenever $x_r$ and $y_s$ are both multiples of $q_{rs}$ and
$x_r/m_r$ and $y_s/m_s$ are both at most $c^{3/2}$. A quick examination of the proof of Lemma 6.1
shows that the choice of $q$ did not depend on $\lambda$, $\mu$ or $\nu$, but only on the rational approximations
to $\alpha$. Therefore, by averaging over all possible values of the other coordinates of
$x$ and $y$ we may conclude that either $|\mathbb{E}_{x,y}\,b(x, y)| < 2c$ or there exists a positive integer
$q_{rs} \le 2c^{-1}$ and an integer $p_{rs}$ such that $|\alpha_{rs} - p_{rs}/q_{rs}| \le 2c^{-2}/m_r m_s$. (This is the same
conclusion as that of the previous paragraph, but the assumption is different, since now
we are averaging over all $x$ and $y$ rather than fixing all but one coordinate.) In the second
case, $\|\alpha_{rs} x_r y_s\| \le 2c$ whenever $x_r$ and $y_s$ are both multiples of $q_{rs}$ and $x_r/m_r$ and $y_s/m_s$
are both at most $c^{3/2}$.

Since this is true for every $r$ and $s$, either $|\mathbb{E}_{x,y}\,b(x, y)| < 2c$ or there are $d^2$ positive
integers $q_{rs} \le 2c^{-1}$ such that $\|\alpha_{rs} x_r y_s\| \le 2c$ whenever $x_r$ and $y_s$ are both multiples of $q_{rs}$
and $x_r/m_r$ and $y_s/m_s$ are both at most $c^{3/2}$. For each $r$, let $q_r$ be the product of all the $q_{rs}$
and all the $q_{sr}$. Then $q_r$ is at most $(2c^{-1})^{2d}$, and if $x_r$ is a multiple of $q_r$ and $y_s$ is a multiple
of $q_s$ with $x_r/m_r$ and $y_s/m_s$ both at most $c^{3/2}$, then again $\|\alpha_{rs} x_r y_s\| \le 2c$. But if that is
true for every $r$ and every $s$, then $\|\sum_{r,s} \alpha_{rs} x_r y_s\| \le 2d^2 c$, which proves the result. □

Our next target is to prove that quadratic averages either have small $U^2$ norms or are
uniformly close to functions with moderately small $U^2$-dual norms. We begin with a lemma
about linear phase functions on subsets of $\mathbb{Z}_N$. Before stating it, let us give a definition
that generalizes our earlier concepts of interior, closure and boundary to arbitrary pairs of
sets.

Definition. Given a pair $(A, B)$ of subsets of $\mathbb{Z}_N$, define the closure of $A$ (relative to $B$) to
be $A + B$ and the interior to be $\{x : x + B \subset A\}$. Denote these by $A^+$ and $A^-$, respectively.
Define the boundary of $A$ to be $A^+ \setminus A^-$ and denote it by $\partial A$.

As before, when we use the notation $A^+$, $A^-$ and $\partial A$, it will always be clear from the
context what the set $B$ is that we are implicitly talking about.

Lemma 6.3. Let $A$ be a subset of $\mathbb{Z}_N$, let $\phi : A \to \mathbb{Z}_N$ be a Freiman homomorphism, let
$B$ be a subset of $A - A$ that contains $0$, and let $\psi$ be the function $\psi(d) = \phi(x + d) - \phi(x)$
for some $x \in A \cap (A - d)$, which is well-defined everywhere on $B$. Let the densities of $A$
and $B$ be $\gamma$ and $\theta$. Let $f$ be the function defined by taking $f(x) = \gamma^{-1}\omega^{\phi(x)}$ when $x \in A$
and $0$ otherwise, and let $g$ be defined by taking $g(d) = \theta^{-1}\omega^{\psi(d)}$ whenever $d \in B$, and
$0$ otherwise. Then $\|f - f * g\|_\infty \le 2\gamma^{-1}$, and $f - f * g$ is supported inside the boundary $\partial A$.

Proof. First let us deal with the uniform bound for $f - f * g$. Since $\|f\|_\infty \le \gamma^{-1}$, it is enough
to prove that $\|f * g\|_\infty \le \gamma^{-1}$. But this is clear because $f * g(x) = \mathbb{E}_{d \in B}\,f(x - d)\,\omega^{\psi(d)}$, which
is an average of numbers with absolute value at most $\gamma^{-1}$. (This equality is the reason for
normalizing $g$ with the constant $\theta^{-1}$.)

If $x \notin A^+ = A + B$, then $f(x - d) = 0$ for every $d \in B$, so $f * g(x) = 0$. Since $0 \in B$ and
$f$ is supported in $A$, $f(x) = 0$ as well.

If $x \in A^-$, then

$$f * g(x) = \mathbb{E}_{d \in B}\,f(x - d)\,\omega^{\psi(d)} = \gamma^{-1}\,\mathbb{E}_{d \in B}\,\omega^{\phi(x-d) + \psi(d)} = \gamma^{-1}\,\mathbb{E}_{d \in B}\,\omega^{\phi(x)} = f(x).$$

This proves the lemma. □
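
The conclusion of Lemma 6.3 can be checked directly in a simple example. The sketch below rests on our own illustrative choices, none of which come from the text: $N = 101$, $A$ is the interval $\{0, \dots, 39\}$, $B$ is the symmetric interval $\{-2, \dots, 2\}$, and $\phi(x) = 3x$, so that $\psi(d) = 3d$. It computes $|f - f * g|$ at every point of $\mathbb{Z}_N$ together with the closure and interior of $A$ relative to $B$.

```python
import cmath


def lemma_6_3_check(N=101, a_len=40, b_radius=2, slope=3):
    """Compare f with f*g for A = {0,...,a_len-1}, B = {-b_radius,...,b_radius}.

    phi(x) = slope * x is a Freiman homomorphism on A, and psi(d) = slope * d.
    Returns (pointwise |f - f*g|, closure of A, interior of A, density of A).
    """
    def omega(t):
        return cmath.exp(2j * cmath.pi * (t % N) / N)

    A = set(range(a_len))
    B = [d % N for d in range(-b_radius, b_radius + 1)]
    gamma = len(A) / N  # density of A

    def f(x):
        x %= N
        return omega(slope * x) / gamma if x in A else 0.0

    def f_conv_g(x):
        # f*g(x) = E_{d in B} f(x - d) * omega^{psi(d)}
        return sum(f(x - d) * omega(slope * d) for d in B) / len(B)

    closure = {(a + d) % N for a in A for d in B}
    interior = {x for x in range(N) if all((x + d) % N in A for d in B)}
    diff = {x: abs(f(x) - f_conv_g(x)) for x in range(N)}
    return diff, closure, interior, gamma


diff, closure, interior, gamma = lemma_6_3_check()
# Largest discrepancy on the interior, and outside the closure.
print(max(diff[x] for x in interior))
print(max(diff[x] for x in range(101) if x not in closure))
```

Both printed values should be zero up to rounding error, so that $f - f * g$ is indeed supported on the boundary $\partial A$.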

We would like to think of $f * g$ as approximating $f$, so we shall apply Lemma 6.3 to a
pair of sets $A$ and $B$ such that $\partial A$ is small. We have already seen such pairs in the context
of regular Bohr neighbourhoods, but we now need to look at multidimensional arithmetic
progressions as well.

Lemma 6.4. Let $P$ be a proper $d$-dimensional arithmetic progression consisting of all
points $x_0 + \sum_{i=1}^d a_i x_i$ such that $0 \le a_i < m_i$, let $\epsilon > 0$, and let $Q$ be the progression
consisting of all points $\sum_{i=1}^d b_i x_i$ such that $0 \le b_i < \epsilon m_i/d$. Let the density of $P$ be $\gamma$.
Then the density of $P^+ \setminus P^-$ is at most $3\epsilon\gamma$ and the density of $Q$ is at least $(\epsilon/d)^d \gamma$.

Proof. The number of integers of the form $r + s$, where $r$ is an integer between $0$ and $m - 1$
and $s$ is an integer such that $0 \le s \le \eta m$, is at most $(1 + \eta)m$, since we have equality when
$\eta m$ is an integer, and if we increase $\eta m$ towards the next integer then we increase $(1 + \eta)m$
without increasing the number of elements of the set.

Now suppose that $r$ is an integer and that $r \ge \lfloor \eta m \rfloor$. Then $r - s \ge 0$ whenever $s$
is an integer and $s \le \eta m$. The number of integers less than $m$ with this property is
$m - \lfloor \eta m \rfloor \ge m(1 - \eta)$.

From these two calculations, we find that $P^+$ has density at most $(1 + \epsilon/d)^d \gamma$ and $P^-$
has density at least $(1 - \epsilon/d)^d \gamma$. The first result now follows from the simple estimates
$(1 + \epsilon/d)^d \le 1 + 2\epsilon$ and $(1 - \epsilon/d)^d \ge 1 - \epsilon$.

Also, the number of integers $r$ such that $0 \le r < \eta m$ is $\lceil \eta m \rceil \ge \eta m$, so the density of $Q$
is at least $(\epsilon/d)^d$ times that of $P$, so we have the second assertion as well. □
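
The one-dimensional counts used in this proof are simple to verify by brute force. The following sketch is ours (the function names are hypothetical) and mirrors the three calculations above for a single coordinate: the size of the sumset $\{r + s\}$, the number of $r$ that stay non-negative after subtracting any admissible $s$, and the number of integers in $[0, \eta m)$.

```python
import math


def closure_size(m, eta):
    """Number of integers r + s with 0 <= r < m and 0 <= s <= eta*m."""
    top = math.floor(eta * m)
    return len({r + s for r in range(m) for s in range(top + 1)})


def interior_size(m, eta):
    """Number of 0 <= r < m with r - s >= 0 for every integer 0 <= s <= eta*m."""
    top = math.floor(eta * m)
    return sum(1 for r in range(m) if all(r - s >= 0 for s in range(top + 1)))


def small_count(m, eta):
    """Number of integers r with 0 <= r < eta*m."""
    return sum(1 for r in range(m + 1) if r < eta * m)


m, eta = 10, 0.25
print(closure_size(m, eta), (1 + eta) * m)    # should be at most (1 + eta) * m
print(interior_size(m, eta), (1 - eta) * m)   # should be at least (1 - eta) * m
print(small_count(m, eta), eta * m)           # should be at least eta * m
```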

Next, we show why approximating a function by a convolution of two functions helps us
to control its $U^2$-dual norm.

Lemma 6.5. Let $A$ and $B$ be two sets and let $f$ and $g$ be two functions such that $|f|$ is
bounded above by the characteristic measure of $A$ and $|g|$ is bounded above by the characteristic
measure of $B$. Suppose that the density of $A$ is $\gamma$ and the density of $B$ is $\theta$. Then

$$\|f * g\|_{U^2}^* \le \gamma^{-1/2}\theta^{-1/4}.$$



Proof. Let $h$ be any other function, and define $g^*$ by $g^*(x) = \overline{g(-x)}$ for every $x$. Then

$$\langle f * g, h\rangle = \langle f, g^* * h\rangle \le \|f\|_2\,\|g^* * h\|_2 \le \gamma^{-1/2}\,\|g\|_{U^2}\,\|h\|_{U^2} \le \gamma^{-1/2}\theta^{-1/4}\,\|h\|_{U^2},$$

from which the result follows. Here we have used the fact that the characteristic measure
of a set of density $\delta$ has $L_2$ norm at most $\delta^{-1/2}$ and $U^2$ norm at most $\delta^{-1/4}$. We have also
made use of the inequality $\|u * v\|_2 \le \|u\|_{U^2}\|v\|_{U^2}$, which can be thought of as a special
case of Young's inequality or as a special case of Lemma 3.8 of [G01], a Cauchy-Schwarz
inequality for the uniformity norms. □

Putting the last three lemmas together, we deduce the following. 



Lemma 6.6. Let $P$, $\gamma$, $\epsilon$ and $Q$ be as in Lemma 6.4. Let $\phi$ be a Freiman homomorphism
defined on $P$, and let $f(x) = \gamma^{-1}\omega^{\phi(x)}$ if $x \in P$ and $0$ otherwise. Then there exists a
function $h$ such that $\|f - h\|_\infty \le 2\gamma^{-1}$, $\|h\|_{U^2}^* \le \gamma^{-3/4}(\epsilon/d)^{-d/4}$, and $f - h$ is supported in
$P^+ \setminus P^-$, which has density at most $3\epsilon\gamma$.

Proof. Let us apply Lemma 6.5 with $A = P$, $B = Q$, $f$ as given in this lemma, and $g$
as defined in the statement of Lemma 6.3. We shall prove that we can take $h$ to be the
function $f * g$. Lemma 6.3 tells us that $\|f - f * g\|_\infty \le 2\gamma^{-1}$ and that $f - f * g$ is supported
in $P^+ \setminus P^-$. Lemma 6.4 tells us that $P^+ \setminus P^-$ has density at most $3\epsilon\gamma$, and Lemma 6.5 tells
us that $\|f * g\|_{U^2}^* \le \gamma^{-1/2}\theta^{-1/4}$, where $\theta$ is the density of $Q$. Lemma 6.4 tells us that $\theta$ is
at least $(\epsilon/d)^d \gamma$, and this completes the proof. □

We are about to prove a slightly complicated technical lemma that will help us handle
error terms without cluttering up proofs. Before we do so, here is a much simpler technical
lemma that will help us to prove the complicated one without cluttering up its proof.

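Lemma 6.7. Let $\alpha$, $\beta$, $\rho$ and $\sigma$ be positive constants. Let $U$ and $V$ be subsets of $\mathbb{Z}_N$ of
density $\alpha\sigma$ and $\beta$, respectively. For each $y \in V$ let $g_y$ be a function supported in $y + U$
such that $\|g_y\|_\infty \le \rho\alpha^{-1}$. Then $\|\mathbb{E}_{y \in V}\,g_y\|_\infty \le \rho\sigma\beta^{-1}$.

Proof. For each $x$,

$$|\mathbb{E}_{y \in V}\,g_y(x)| \le \rho\alpha^{-1}\,\mathbb{P}[x \in y + U \mid y \in V] \le \rho\alpha^{-1}\alpha\sigma\beta^{-1} = \rho\sigma\beta^{-1}.$$

The lemma follows. □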



Lemma 6.8. Let $(A, B)$ and $(C, D)$ be two pairs of subsets of $\mathbb{Z}_N$ with $C^+ \subset B$. Let the
densities of $A$ and $C$ be $\beta$ and $\gamma$, respectively. Suppose also that $\partial C$ has density at most
$\epsilon\gamma$. Let $g$ be a function defined on $\mathbb{Z}_N$ such that $|g(x)| \le \beta^{-1}$ for every $x \in A$ and $g(x) = 0$
for every $x \notin A$. For each $y \in A$ let $g_y$ be a function such that $\|g_y\|_\infty \le \gamma^{-1}$ and $g_y$ is
supported in $y + C$. Suppose that $\mathbb{E}_{y \in A}\,g_y(x) = g(x)$ for every $x \in A^-$. Now suppose that for
each $y \in A^-$ there is a function $h_y$ such that $|g_y(x) - h_y(x)| \le \theta\gamma^{-1}$ for every $x \in y + C^-$,
$|g_y(x) - h_y(x)| \le \lambda\gamma^{-1}$ for every $x \in y + \partial C$, and $g_y(x) = h_y(x) = 0$ whenever $x \notin y + C^+$.
And for each $y \in A \setminus A^-$, let $h_y$ be identically zero. Let $h(x) = \mathbb{E}_{y \in A}\,h_y(x)$ for every $x \in \mathbb{Z}_N$.
Then $|g(x) - h(x)| \le (\theta + \lambda\epsilon)\beta^{-1}$ for every $x \in A^-$, $|g(x) - h(x)| \le (4 + \lambda\epsilon)\beta^{-1}$ for every
$x \in \partial A$, and $g(x) = h(x) = 0$ for every $x \notin A^+$.

Proof. If $x \in A^-$ then $\mathbb{E}_{y \in A}\,g_y(x) = g(x)$, by hypothesis. If $x \in \partial A$, then Lemma 6.7 (with
$U = C$ and $V = A$) implies that $|\mathbb{E}_{y \in A}\,g_y(x)| \le \beta^{-1}$, which implies that $|g(x) - \mathbb{E}_{y \in A}\,g_y(x)| \le 2\beta^{-1}$.
And if $x \notin A^+$, then both $g(x)$ and $\mathbb{E}_{y \in A}\,g_y(x)$ are zero.

Let us write $u_y$ for the restriction of $g_y - h_y$ to $y + C^-$ and $v_y$ for the restriction of $g_y - h_y$
to $y + \partial C$. Then $g_y - h_y = u_y + v_y$ for every $y \in A$. If $y \in A^-$, then $\|u_y\|_\infty \le \theta\gamma^{-1}$ and
$\|v_y\|_\infty \le \lambda\gamma^{-1}$. If $y \in A \setminus A^-$ then $\|u_y\|_\infty$ and $\|v_y\|_\infty$ are both at most $\gamma^{-1}$.

For every $x \in A^-$, $|\mathbb{E}_{y \in A}\,u_y(x)| \le \theta\beta^{-1}$ by Lemma 6.7 (with $U = C^-$ and $V = A$). For
every $x \in \partial A$, $|\mathbb{E}_{y \in A}\,u_y(x)| \le 2\beta^{-1}$, again by Lemma 6.7. (In this case, we have the bound
$\|u_y\|_\infty \le 2\gamma^{-1}$.) And for every $x \notin A^+$, $\mathbb{E}_{y \in A}\,u_y(x) = 0$.

If $x \in A^+$, then $|\mathbb{E}_{y \in A}\,v_y(x)| \le \lambda\epsilon\beta^{-1}$, again by Lemma 6.7 (this time with $U = \partial C$).
And if $x \notin A^+$, then $\mathbb{E}_{y \in A}\,v_y(x) = 0$.




Adding these estimates together, we find that $|\mathbb{E}_{y \in A}\,g_y(x) - \mathbb{E}_{y \in A}\,h_y(x)|$ is at most
$(\theta + \lambda\epsilon)\beta^{-1}$ if $x \in A^-$, at most $(2 + \lambda\epsilon)\beta^{-1}$ if $x \in \partial A$, and $0$ if $x \notin A^+$. Finally, combining this
with the estimates for $g - \mathbb{E}_{y \in A}\,g_y$ in the first paragraph, we obtain the result stated. □

In the next statement, it may not be clear why $\eta$ cannot be taken to be arbitrarily small.
The reason is that the maximum possible density $\gamma$ decreases with $\eta$, so in fact the bound
on $\|Q''\|_{U^2}^*$ increases as $\eta$ decreases.

Corollary 6.9. Let $0 < \epsilon < 1$ and let $0 < \eta < 1/20$. Let $B$ be a regular Bohr set of
density $\beta$ and let $B'$ be a Bohr subset with $B' \prec_\eta B$. Let $P$ be a $d$-dimensional arithmetic
progression of density $\gamma$ such that $P + P$ lives inside $B'$, let $q$ be a quadratic form on $B$
and let $Q$ be a generalized quadratic average with base $(B, q)$. Then for every $\alpha > 0$, either
$\|Q\|_{U^2} \le (11\eta + \alpha)^{1/8}$ or there exists a function $Q''$ such that $\|Q - Q''\|_\infty \le 4\pi d^2\alpha + 2\epsilon + 7\eta$
and $\|Q''\|_{U^2}^* \le \gamma'^{-3/4}(\epsilon/d)^{-d/4}$, where $\gamma' \ge (\alpha/4)^{4d^2}\gamma$.

Proof. Suppose first that Q has rank at most log(l/a) relative to P. In this case, we are 
immediately done, since Lemma [5\2l tells us that ||<5||c/ 2 < (H?? + a) 1//s - 

Now suppose that Q has rank at least log(l/a) relative to P. As usual, let us begin by 
assuming that Q is a non-generalized quadratic average, so that it has a formula of the form 
Q(x) = ¥* y( z x _BU qy<yX \ where q y (x) = q(x — y) + (p y (x — y) for some Freiman homomorphism 
(f) y defined on B. For each y let us define f y (x) to be f3~ l uj qy<yXSl if x G y + B and 
otherwise. Then, as we commented after defining quadratic averages, Q is the average of 
all the functions f y . The strategy of our proof will be to show that each function f y can 
be approximated by a function with small t/ 2 -dual norm in such a way that the average of 
all the errors is uniformly small. 

We shall begin by examining / = /o, which is supported in B. By the definition of rank, 
we have the inequality 

|E a , a ,, b ,^ e pa;^ +b - a '- b ')-^- a ')-^ fe - b ')| > a. 

It follows that there exist a' and b' in P such that 

|E a>beP u; <?(a+ ''- a '- b ' ) - <?(a - a,) - ,?(b - 6 ' ) | > a. 

Choose such an a' and b', and write /3(u, v) for q{u + v) — q(u) — q(v). Then 

q(a + b-a'- b') - q(a - a') - q(b - b') = 0(a -a',b- b') 

which we can expand into the homogeneous part /3(a, b) and the linear and constant terms 
-f3(a',b)-(3{a,b') + (3(a',b'). 
Let us discuss further the relationship between $q$ and $\beta$. A quadratic homomorphism on
$P$ must be given by a formula of the form $q(x) = \sum_{i,j} a_{ij}x_ix_j + \sum_i b_ix_i + c$ for a matrix
$(a_{ij})$ that we may take to be symmetric (since we can replace it by $(a_{ij} + a_{ji})/2$). Then
$\beta(u,v)$ works out to be $2\sum_{i,j} a_{ij}u_iv_j$, and there are coefficients $b_i'$, $c_j'$ and $d'$ such that
$\beta(u - a', v - b') = 2\sum_{i,j} a_{ij}u_iv_j + \sum_i b_i'u_i + \sum_j c_j'v_j + d'$. Moreover, $q(x) = \beta(x,x)/2$ for
every $x$.
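
The relationship between $q$ and $\beta$ can be checked numerically. The following is a minimal sketch, assuming integer coefficients reduced mod $N$; the function names, the modulus 101 and the random coefficients are illustrative choices, not taken from the text. A direct expansion gives $q(u+v) - q(u) - q(v) = 2\sum_{i,j} a_{ij}u_iv_j - c$, so the sketch compares against the bilinear form shifted by the constant $-c$; a constant shift does not change the modulus of the exponential sums considered here.

```python
import random

def q(x, A, b, c, N):
    """Quadratic q(x) = sum a_ij x_i x_j + sum b_i x_i + c, reduced mod N."""
    d = len(x)
    quad = sum(A[i][j] * x[i] * x[j] for i in range(d) for j in range(d))
    lin = sum(b[i] * x[i] for i in range(d))
    return (quad + lin + c) % N

def beta(u, v, A, b, c, N):
    """beta(u, v) = q(u + v) - q(u) - q(v), reduced mod N."""
    w = [ui + vi for ui, vi in zip(u, v)]
    return (q(w, A, b, c, N) - q(u, A, b, c, N) - q(v, A, b, c, N)) % N

def bilinear_part(u, v, A, N):
    """The homogeneous bilinear form 2 * sum a_ij u_i v_j, reduced mod N."""
    d = len(u)
    return (2 * sum(A[i][j] * u[i] * v[j] for i in range(d) for j in range(d))) % N

def check_identity(N=101, d=3, trials=200, seed=0):
    """Check beta(u, v) == 2 u^T A v - c (mod N) on random inputs."""
    rng = random.Random(seed)
    # Symmetric integer matrix (hypothetical data for illustration).
    A = [[0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i, d):
            A[i][j] = A[j][i] = rng.randrange(N)
    b = [rng.randrange(N) for _ in range(d)]
    c = rng.randrange(N)
    for _ in range(trials):
        u = [rng.randrange(-5, 6) for _ in range(d)]
        v = [rng.randrange(-5, 6) for _ in range(d)]
        if beta(u, v, A, b, c, N) != (bilinear_part(u, v, A, N) - c) % N:
            return False
    return True
```

Calling `check_identity()` exercises the identity on 200 random pairs $(u, v)$.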

Since $|\mathbb{E}_{a,b \in P}\,\omega^{\beta(a-a',b-b')}| \geq \alpha$, Corollary 6.2 (with $c = \alpha/2$) implies that there is a
subprogression $P'$ of $P$ of density at least $(\alpha/4)^{2d^2}(\alpha/2)^{3d/2}\gamma$ and of dimension $d$ such
that $|1 - \omega^{\beta(a,b)}| \leq 2\pi d^2\alpha$ for every $a$ and $b$ in $P'$. If we restrict further, to pairs $(a,b)$
such that all their coordinates are even, then we obtain a progression $P''$ of density
$\gamma' \geq 2^{-d}(\alpha/4)^{2d^2}(\alpha/2)^{3d/2}\gamma \geq (\alpha/4)^{4d^2}\gamma$ and of dimension $d$ such that
$|1 - \omega^{\beta(a,b)/2}| \leq 2\pi d^2\alpha$ for every $a$ and $b$ in $P''$. Let us set $\theta$ to be $2\pi d^2\alpha$. Then there is
a Freiman homomorphism $\psi$ defined on $P''$ such that $|\omega^{q(a)} - \omega^{\psi(a)}| \leq \theta$ for every $a \in P''$.
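
The last inequality in the lower bound for $\gamma'$ can be verified directly. The following short computation is a sketch under the assumptions $0 < \alpha \leq 1$ and $d \geq 1$, which are not restated at this point of the argument.

```latex
\begin{align*}
2^{-d}\Bigl(\frac{\alpha}{4}\Bigr)^{2d^2}\Bigl(\frac{\alpha}{2}\Bigr)^{3d/2}
 &= 2^{-d}\,2^{3d/2}\Bigl(\frac{\alpha}{4}\Bigr)^{2d^2 + 3d/2}
  = 2^{d/2}\Bigl(\frac{\alpha}{4}\Bigr)^{2d^2 + 3d/2} \\
 &\geq \Bigl(\frac{\alpha}{4}\Bigr)^{2d^2 + 3d/2}
  \geq \Bigl(\frac{\alpha}{4}\Bigr)^{4d^2},
\end{align*}
```

where the final step uses $\alpha/4 < 1$ together with $3d/2 \leq 2d^2$ for $d \geq 1$.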

We now apply Lemma 6.6 to the function $l$ defined by $l(x) = \gamma'^{-1}\omega^{\psi(x)}$ when $x \in P''$
and $l(x) = 0$ otherwise, where $\psi$ is the Freiman homomorphism just obtained. It gives us a
subprogression $P_3 \subset P''$ and a function $h'$ such that $\|l - h'\|_\infty \leq 2\gamma'^{-1}$,
$\|h'\|_{U^2}^* \leq \gamma'^{-3/4}(\epsilon/d)^{-d/4}$, and $l - h'$ is supported on $\partial P''$, which has
density at most $3\epsilon\gamma'$. (Here the boundary is taken with respect to $P_3$, and $\epsilon$ denotes the
proportion of $P''$ that we take to lie in $P_3$.)

Let us define $f'(x)$ to be $\gamma'^{-1}\omega^{q(x)}$ if $x \in P''$ and $0$ otherwise. The above calculations
show that $|f'(x) - h'(x)|$ is at most $\theta\gamma'^{-1}$ when $x \in P''^-$, at most $(2 + \theta)\gamma'^{-1}$ when $x \in \partial P''$,
and $0$ when $x \notin P''^+$. Moreover, the density of $\partial P''$ is at most $3\epsilon\gamma'$.

We are preparing to apply Lemma 6.8. Our pairs of sets are $(B, B')$ and $(P'', P_3)$, which
satisfy the hypothesis since $P''^+ = P'' + P_3 \subset P + P \subset B'$. Our function $g$ is defined by
taking $g(x) = \beta^{-1}\omega^{q(x)}$ if $x \in B$ and $0$ otherwise. If $y \in B^-$ then we shall define $g_y(x)$ to
be $\gamma'^{-1}\omega^{q(x)}$ if $x \in y + P''$ and $0$ otherwise. (This is the normalized restriction of $\omega^q$ to
$y + P''$.) If $y \notin B^-$ we shall define $g_y$ to be identically zero. Then if $x \in B^-$, we have
$\mathbb{E}_{y \in B}\,g_y(x) = \gamma'^{-1}\beta g(x)\,\mathbb{P}[x \in y + P'' \mid y \in B] = g(x)$, so the hypotheses about $g$ and the $g_y$
are satisfied.

The function $f'$ just discussed was equal to $g_0$. For each fixed $y \in B^-$ the function
$q(x) - q(x - y)$ is a Freiman homomorphism on $y + P''$, so the argument used for $f'$ can
also be used to provide for us a function $h_y$ such that $\|h_y\|_{U^2}^* \leq \gamma'^{-3/4}(\epsilon/d)^{-d/4}$, and such
that $|g_y(x) - h_y(x)|$ is at most $\theta\gamma'^{-1}$ when $x \in y + P''^-$, at most $(2 + \theta)\gamma'^{-1}$ when $x \in y + \partial P''$,
and $0$ otherwise. Thus, in Lemma 6.8 we can take $\beta$ to be $\beta$, $\gamma$ to be $\gamma'$, $\theta$ to be $\theta$, $\epsilon$ to be
$\epsilon$, and $\lambda$ to be $2 + \theta$.
We then set $h(x) = \mathbb{E}_{y \in B}\,h_y(x)$. By Lemma 6.8, $|g(x) - h(x)|$ is at most $(\theta + 2\epsilon + \theta\epsilon)\beta^{-1}$
for every $x \in B^-$, at most $(4 + 2\epsilon + \theta\epsilon)\beta^{-1}$ for every $x \in \partial B$, and $g(x) = h(x) = 0$ for
$x \notin B^+$. Moreover, since $h$ is just the average of all the $h_y$, the triangle inequality implies
that $\|h\|_{U^2}^* \leq \gamma'^{-3/4}(\epsilon/d)^{-d/4}$.

Now $g$ is the function $f_0$ defined at the beginning of the proof, where we defined $f_y(x)$ to
be $\beta^{-1}\omega^{q_y(x)}$ if $x \in y + B$ and $0$ otherwise. Since $q_y(x) - q(x - y)$ is a Freiman homomorphism
on $y + B$, the same argument gives us a function $k_y$ such that $|f_y(x) - k_y(x)|$ is at most
$(\theta + 2\epsilon + \theta\epsilon)\beta^{-1}$ for every $x \in y + B^-$, at most $(4 + 2\epsilon + \theta\epsilon)\beta^{-1}$ for every $x \in y + \partial B$, and
$f_y(x) = k_y(x) = 0$ for $x \notin y + B^+$. Also, $\|k_y\|_{U^2}^* \leq \gamma'^{-3/4}(\epsilon/d)^{-d/4}$.

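We now apply Lemma 6.8 once again, but this time it is simpler because our set $A$ will
have empty boundary. Indeed, we take $A$ and $B$ to be $\mathbb{Z}_N$, $C$ to be $B$, and $D$ to be $B'$.
This time round we can take $\beta$ to be $1$, $\gamma$ to be $\beta$, $\epsilon$ to be $\eta$, $\theta$ to be $\theta + 2\epsilon + \theta\epsilon$, and $\lambda$ to be
$4 + 2\epsilon + \theta\epsilon$. Then $Q(x) = \mathbb{E}_{y \in x - B}\,\omega^{q_y(x)}$ by definition, and this is equal to $\mathbb{E}_{y \in \mathbb{Z}_N} f_y(x)$. Let
$Q''(x) = \mathbb{E}_{y \in \mathbb{Z}_N} k_y(x)$. Then Lemma 6.8 tells us that
$\|Q - Q''\|_\infty \leq \theta + 2\epsilon + \theta\epsilon + (4 + 2\epsilon + \theta\epsilon)\eta$.
Moreover, $\|Q''\|_{U^2}^* \leq \gamma'^{-3/4}(\epsilon/d)^{-d/4}$, again by the triangle inequality. □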

We now combine Corollary 6.9 with a result of Ruzsa so that we can say something
about bilinear phase functions defined on Bohr sets.

Theorem 6.10. Let $Q$ be a generalized quadratic average of complexity $(d,\rho)$. Then for
every $\alpha$ with $0 < \alpha < 1/20$, either $\|Q\|_{U^2} \leq (12\alpha)^{1/8}$ or there exists a function $Q''$ such
that $\|Q - Q''\|_\infty \leq 16d^2\alpha$ and $\|Q''\|_{U^2}^* \leq (4/\alpha)^{4d^2}(800d^2/\rho)^d$.

Proof. Suppose that $Q$ has base $(B,q)$, where $B = B(K,\rho)$ and $K$ has cardinality $d$. Let
$\eta = \alpha$ and let $B' \prec_\alpha B$. Then $B' = B(K,\sigma)$ for some $\sigma \geq \alpha\rho/400d$. A theorem of
Ruzsa [R94] (see also [N96]) tells us that $B'$ contains a proper $d$-dimensional arithmetic
progression of density at least $(\sigma/d)^d$. Therefore, there is a proper $d$-dimensional arithmetic
progression $P$ of density $\gamma \geq (\sigma/2d)^d$ such that $P + P \subset B'$. By Corollary 6.9 with $\eta = \alpha$
and $\epsilon = \alpha d^2$ (if $\epsilon \geq 1$ then Corollary 6.9 is trivial so we do not need to worry about this),
either $\|Q\|_{U^2} \leq (12\alpha)^{1/8}$ or there exists a function $Q''$ such that $\|Q - Q''\|_\infty \leq 16d^2\alpha$ and
$\|Q''\|_{U^2}^* \leq (\alpha/4)^{-3d^2}(\sigma/2d)^{-3d/4}(\alpha d)^{-d/4}$. A small back-of-envelope calculation shows that
this is at most the bound stated for $\|Q''\|_{U^2}^*$. □
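
The back-of-envelope calculation can be spelled out as follows. This is a sketch under the assumptions $d \geq 1$, $0 < \alpha < 1/20$, $\sigma \geq \alpha\rho/400d$ and $\rho \leq 1$ (the last one is an assumption made here so that $800d^2/\rho \geq 1$).

```latex
\begin{align*}
\Bigl(\frac{\alpha}{4}\Bigr)^{-3d^2}\Bigl(\frac{\sigma}{2d}\Bigr)^{-3d/4}(\alpha d)^{-d/4}
 &\leq \Bigl(\frac{4}{\alpha}\Bigr)^{3d^2}\Bigl(\frac{800d^2}{\alpha\rho}\Bigr)^{3d/4}\alpha^{-d/4}
 && \text{since } \frac{2d}{\sigma} \leq \frac{800d^2}{\alpha\rho} \text{ and } d \geq 1 \\
 &= \Bigl(\frac{4}{\alpha}\Bigr)^{3d^2}\Bigl(\frac{800d^2}{\rho}\Bigr)^{3d/4}\alpha^{-d} \\
 &\leq \Bigl(\frac{4}{\alpha}\Bigr)^{3d^2 + d}\Bigl(\frac{800d^2}{\rho}\Bigr)^{d}
 && \text{since } \alpha^{-d} \leq (4/\alpha)^d \text{ and } 800d^2/\rho \geq 1 \\
 &\leq \Bigl(\frac{4}{\alpha}\Bigr)^{4d^2}\Bigl(\frac{800d^2}{\rho}\Bigr)^{d}
 && \text{since } 3d^2 + d \leq 4d^2 .
\end{align*}
```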

7. A more precise decomposition theorem



Theorem 3.4 stated that every function that is bounded above in $L_2$ can be decomposed
into a linear combination of quadratic averages plus a sum of two error terms, one of which
is small in $U^3$ and one in $L_1$. The aim of this section is to prove a refinement of this
statement. Once again, we shall show that a function $f$ with $\|f\|_2 \leq 1$ can be decomposed
as a linear combination of quadratic averages plus a small error. However, we shall collect
these quadratic averages into a small number of "clusters" in such a way that two quadratic
averages that belong to the same cluster will have a low-rank difference. Then the results of
the previous section will allow us to express each cluster as a product of just one quadratic
average with a function with small $U^2$-dual norm. We proved an analogous theorem for $\mathbb{F}_p^n$:
after the hard work of the previous section, the rest of the adaptation is relatively routine.

First let us combine Theorem 6.10 with Lemma 4.2 in order to describe what happens
if two generalized quadratic averages have a significant correlation. The following lemma
should be thought of as a companion to Corollary 5.3. The appearance of the generalized
quadratic average $Q_0$ in the statement may look a bit strange: it is there for technical
reasons that will be explained later.

Lemma 7.1. Let $B$ and $B'$ be two arbitrary Bohr sets and let the complexity of $B \cap B'$ be
$(d,\rho)$. Let $q$ and $q'$ be quadratic forms on $B$ and $B'$. Let $Q$ and $Q'$ be generalized quadratic
averages with bases $(B,q)$ and $(B',q')$ and suppose that $\langle Q,Q' \rangle \geq \zeta$. Let $Q_0$ be another
generalized quadratic average with base $(B,q)$. Then there exists a function $Q'''$ such that
$\|Q_0\overline{Q'} - Q'''\|_\infty \leq \zeta/2 + d^2\zeta^8/2^5$ and $\|Q'''\|_{U^2}^* \leq (2^{11}/\zeta^8)^{4d^2}(800d^2/\rho)^d$.

Proof. Let $\eta = \zeta/36$. Let $B_1$ and $B_2$ be regular Bohr sets such that $B_2 \prec_\eta B_1 \prec_\eta B \cap B'$.
Then Lemma 4.2 tells us that there is a generalized quadratic average $Q''$ with base
$(B_2, q - q')$ such that $\|Q\overline{Q'} - Q''\|_\infty \leq 18\eta = \zeta/2$.

Since $Q_0$ also has base $(B,q)$, the same argument gives us a generalized quadratic average
$Q_0''$ with base $(B_2, q - q')$ such that $\|Q_0\overline{Q'} - Q_0''\|_\infty \leq \zeta/2$.

Now $\mathbb{E}_x Q''(x) \geq \zeta - 18\eta = \zeta/2$, so $\|Q''\|_{U^2} \geq \zeta/2$. Therefore, if we set $\alpha$ to be $\zeta^8/2^9$,
then Theorem 6.10 implies that there exists a function $Q'''$ such that $\|Q'' - Q'''\|_\infty \leq 16d^2\alpha$
and $\|Q'''\|_{U^2}^* \leq (4/\alpha)^{4d^2}(800d^2/\rho)^d$.

However, we wanted a similar statement for $Q_0''$ rather than $Q''$. This does not quite
follow from Theorem 6.10, but it follows from the proof. A quick examination of Corollary
6.9 reveals that the alternatives in question depend just on the rank of $Q$ and not on $Q$
itself. (To be precise, if two quadratic forms have the same base and the same rank with
respect to $P$, then there must be one half of the dichotomy that applies to both forms.)
Therefore, we obtain the result stated. □
What will be crucial to us later, if we want a reasonable bound, is that the $U^2$-dual
norm of $Q'''$ in the above lemma depends polynomially on $\zeta$ for fixed $d$. It is for this reason
that it would have been too expensive to use Green and Tao's local Bogolyubov lemma to
prove Theorem 6.10. That would have allowed us to prove an analogue of Corollary 6.2
for bilinear phase functions defined on Bohr sets. However, the subset we passed to would
then have been a Bohr set whose dimension depended polynomially on $c$, whereas in fact
we passed to a multidimensional progression without any increase in dimension. That
would have translated into an exponential dependence on $\zeta$ in Lemma 7.1.

Unfortunately, before we prove our more precise decomposition result we must deal with
another technical difficulty that did not arise for quadratic averages on $\mathbb{F}_p^n$, which is that
$Q^{-1}$ does not in general equal $\overline{Q}$. In our previous paper it was convenient to write $Q_j$
as $Q_i\overline{Q_i}Q_j$. In order to do something similar in the $\mathbb{Z}_N$ case we shall first show that for
every regular Bohr neighbourhood $B$ and every quadratic function $q : B \to \mathbb{Z}_N$ we can find
a smaller Bohr neighbourhood $B'$ and a quadratic average $Q$ with base $(B', q)$ such that
$|Q(x)| = 1$ for almost every $x$. The statement of Lemma 7.1 is designed so that we will
then be able to replace any given $Q_i$ by a quadratic average with this convenient property
and with the same base.

We begin by proving, using a very standard argument, that $\mathbb{Z}_N$ can be covered fairly
efficiently by copies of $B$.

Lemma 7.2. Let $B = B(K,\rho)$ be a Bohr set and write $d$ for the size of $K$ and $\beta$ for the
density of $B$. Then there is a set $\{B_1, \dots, B_m\}$ of translates of $B$ such that $m \leq 5^d\beta^{-1}$
and every point in $\mathbb{Z}_N$ belongs to at least one $B_i$.

Proof. A basic fact about Bohr sets is that the Bohr set $B'' = B(K,\rho/2)$ has density at
least $5^{-d}\beta$. (See for example [GrT09b], Lemma 8.1.) Let $x_1, \dots, x_m$ be a maximal collection
of points with the property that the translates $x_i + B''$ are disjoint. Then $m \leq 5^d\beta^{-1}$. Also,
the sets $x_i + B$ cover $\mathbb{Z}_N$, since if $x \notin x_i + B$ for any $i$, then $x + B''$ and $x_i + B''$ are disjoint
for every $i$, contradicting maximality (or $x$ would belong to $x_i + B'' - B'' \subset x_i + B$). □
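
The covering argument can be illustrated with a small computation. The sketch below is hedged in two ways: the names `bohr_set` and `cover_by_translates` are hypothetical, and the Bohr set is defined through the distance $\|rx/N\|$ to the nearest integer rather than through $|1 - e(rx/N)|$, an assumption made so that all comparisons are exact rational arithmetic. It greedily builds a maximal family of disjoint translates of $B(K,\rho/2)$ and returns their centres.

```python
from fractions import Fraction

def bohr_set(N, K, rho):
    """Points x of Z_N with ||r*x/N|| <= rho for every r in K.

    Assumption: width is measured by distance to the nearest integer.
    """
    rho = Fraction(rho)
    out = set()
    for x in range(N):
        ok = True
        for r in K:
            t = (r * x) % N
            if Fraction(min(t, N - t), N) > rho:
                ok = False
                break
        if ok:
            out.add(x)
    return out

def cover_by_translates(N, K, rho):
    """Greedy maximal packing by translates of B(K, rho/2).

    Returns the centres x_i together with B = B(K, rho) and B'' = B(K, rho/2).
    """
    B = bohr_set(N, K, rho)
    B_half = bohr_set(N, K, Fraction(rho) / 2)
    centres, used = [], set()
    for x in range(N):
        translate = {(x + b) % N for b in B_half}
        # Keep x only if its translate is disjoint from all earlier ones.
        if used.isdisjoint(translate):
            centres.append(x)
            used |= translate
    return centres, B, B_half
```

For instance, `cover_by_translates(101, [1, 10], Fraction(1, 5))` returns centres whose translates of $B$ cover all of $\mathbb{Z}_{101}$, with the number of centres bounded by $N/|B''|$ because the translates of $B''$ are disjoint.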

The condition $B' \prec_{\epsilon/5^d} B$ that appears in the next corollary may look rather expensive
with its exponential dependence on $d$, but the effect on our eventual bound is not
particularly serious: the density of $B'$ is exponential in $d^2$ instead of $d$. When we come to
apply the result, $d$ will be bounded above by $(2/\delta)^{C_0}$ for some absolute constant $C_0$, so
this decrease in the density is comparable to the result of replacing $C_0$ by $2C_0$.
Corollary 7.3. Let $\epsilon > 0$, let $B = B(K,\rho)$ be a regular Bohr set, and let $q$ be a quadratic
function defined on $B$. Let $d = |K|$, let $B' \prec_{\epsilon/5^d} B$ and let $B''$ be a Bohr set such that
$B'' - B'' \subset B'$. Then there is a quadratic average $Q$ with base $(B'', q)$ such that for all but
at most $\epsilon N$ values of $x$ the restriction of $Q$ to $x + B''$ is a quadratic phase function. In
particular, $|Q(x)| = 1$ for all but at most $\epsilon N$ values of $x$.

Proof. Let $B_1, \dots, B_m$ be a sequence of translates of $B$ given by the previous lemma, with
$B_i = x_i + B$. On each $B_i$, let $q_i$ be the function $q_i(x) = q(x - x_i)$. Now let us greedily
make the sets $B_i$ disjoint, by letting $B_i' = B_i \setminus (B_1 \cup \dots \cup B_{i-1})$ for each $i$.
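
The greedy disjointification step is simple enough to sketch in a few lines. The function name `make_disjoint` and the use of Python sets to model subsets of $\mathbb{Z}_N$ are illustrative assumptions.

```python
def make_disjoint(sets):
    """Return B'_i = B_i minus (B_1 ∪ ... ∪ B_{i-1}) for each i, in order."""
    seen = set()
    out = []
    for s in sets:
        s = set(s)
        out.append(s - seen)  # remove everything already claimed by earlier sets
        seen |= s
    return out
```

The resulting sets are pairwise disjoint and have the same union as the original ones, so every point $y$ of the union lies in exactly one $B_i'$, which is what the choice of $q_y$ in the next paragraph relies on.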

We are trying to define a function of the form $Q(x) = \mathbb{E}_{y \in x - B''}\,\omega^{q_y(x)}$, so it remains to
choose the functions $q_y$ appropriately. This we do by letting $q_y(x) = q_i(x)$ for the unique
$i$ such that $y \in B_i'$. Since $q_i(x) = q(x - x_i) = q((x - y) - (x_i - y))$, this is of the form
$q(x - y) + \phi_y(x - y)$ for some Freiman homomorphism $\phi_y : B' \to \mathbb{Z}_N$, as required.

Now each $q_i$ is a quadratic homomorphism on $B_i$, so the restriction of $Q$ to $x + B''$ will
be a quadratic phase function if there exists $i$ such that $x + B'' - B'' \subset B_i'$. A sufficient
condition for this is that, for every $i$, either $x - B' \subset B_i$ or $(x - B') \cap B_i = \emptyset$. But Lemma
2.2 implies that this is true for all but at most $5^{-d}\epsilon m|B| \leq \epsilon N$ values of $x$, as claimed. □

The property we have just obtained is a useful one, so let us give it a name. Note that
the Bohr set $B$ from Corollary 7.3 is no longer explicitly mentioned, but its width and
dimension appear (in disguised form) as the parameter $m$ below.

Definition. We say that a quadratic average $Q$ with base $(B'',q)$ is $(\epsilon,m)$-special if the
following holds. There exist at most $m$ elements $x_1, \dots, x_m \in \mathbb{Z}_N$ such that for all but at
most $\epsilon N$ points $x \in \mathbb{Z}_N$ the restriction of $Q$ to $x + B''$ is equal to the restriction of $\omega^{q_i}$ to
$x + B''$, where $q_i(x) = q(x - x_i)$.

We shall not need this definition in the rest of this section, but it will be used in the 
next section. 

The following lemma (which has a simple proof) appears in [GW09a] as Corollary 2.11.

Lemma 7.4. Let $u_1, \dots, u_n$ be a collection of vectors of norm at most 1 in a Hilbert space
$H$, let $\lambda_1, \dots, \lambda_n$ be scalars with $\sum_{i=1}^n |\lambda_i| \leq C$ and let $\delta > 0$. Then there are vectors
$u_{i_1}, \dots, u_{i_k}$ and a set $A \subset \{1, 2, \dots, n\}$ such that $k \leq 2C^2/\delta^2$, and with the following
properties: $\|\sum_{i \in A} \lambda_iu_i\| \leq \delta$, and for every $i \notin A$ there exists $j$ such that
$|\langle u_i, u_{i_j} \rangle| \geq \delta^2/2C$.
We are now ready to state and prove the main result of this section. It is important for
us to be able to vary the parameter e below independently of the quantity C. 

Theorem 7.5. Let $f : \mathbb{Z}_N \to \mathbb{C}$ be a function such that $\|f\|_2 \leq 1$, and let $\delta > 0$. Let
$C_0 = 2^{24}$, $d = (2/\delta)^{C_0}$, $\rho = (\delta/2)^{C_0}$ and $C = 4(2/\delta^2)^{C_0}$, and let $\epsilon > 0$ be at most $(\delta/2)^{5C_0}$.
Then $f$ has a decomposition

$$f(x) = \sum_{i=1}^k Q_i'(x)U_i(x) + g(x) + h(x),$$

with the following properties: $k \leq 2C^2/\delta^2$, the $Q_i'$ are quadratic averages on $\mathbb{Z}_N$ with
complexity at most $(d, \epsilon\rho/800d5^d)$, $\sum_{i=1}^k \|U_i\|_{U^2}^* \leq (8/\epsilon^8)^{4d^2}(2^{20}d^35^d/\epsilon\rho)^dC$,
$\sum_{i=1}^k \|U_i\|_\infty \leq 2C$, $\|g\|_1 \leq 3\delta$ and $\|h\|_{U^3} \leq \delta$. Moreover, the quadratic averages $Q_i'$ are
$(\epsilon, (5/\rho)^d)$-special.

Proof. By Theorem 3.4, $f$ can be decomposed into a sum $\sum_i \lambda_iQ_i(x) + g'(x) + h(x)$, where
each $Q_i$ is a quadratic average of complexity at most $(d,\rho)$, and $\|g'\|_1 \leq \delta$, $\|h\|_{U^3} \leq \delta$ and
$\sum_i |\lambda_i| \leq C$.

Suppose that $Q_i$ has base $(B_i, q_i)$. Lemma 7.2 and Corollary 7.3 tell us that if $B_i' \prec_{\epsilon/5^d} B_i$
then there is a quadratic average $Q_i'$ with base $(B_i', q_i)$ which is $(\epsilon, 5^d\beta^{-1})$-special, where $\beta$
is the density of the base of $Q_i$. In particular, $|Q_i'(x)| = 1$ for all but at most $\epsilon N$ values of
$x$. Furthermore, Lemma 4.1 gives us a generalized quadratic average $Q_i''$ with base $(B_i', q_i)$
such that $\|Q_i - Q_i''\|_\infty \leq 6\epsilon/5^d$, which is at most $2\epsilon$. Note that the complexities of $Q_i'$ and
$Q_i''$ are at most $(d, \epsilon\rho/800d5^d)$. (The additional factor of 2 stems from the requirement that
$B'' - B'' \subset B'$ in Corollary 7.3.)

Now we apply Lemma 7.4 to the linear combination $\sum_i \lambda_iQ_i$. Without loss of generality,
the functions that it gives us are $Q_1, \dots, Q_k$. Then Lemma 7.4 tells us that we can
write $\sum_i \lambda_iQ_i$ in the form $\sum_{i=1}^k \sum_{j \in A_i} \lambda_jQ_j + g''$, where $k \leq 2C^2/\delta^2$, $\|g''\|_2 \leq \delta$ and
$|\langle Q_i, Q_j \rangle| \geq \delta^2/2C$ for every $i \leq k$ and every $j \in A_i$. In order to proceed, we must
rewrite this decomposition in terms of the functions $Q_i'$. That is, we wish to take the sum
$\sum_{j \in A_i} \lambda_jQ_j$ and replace it by $Q_i'\sum_{j \in A_i} \lambda_j\overline{Q_i'}Q_j$.

Since $\|Q_i - Q_i''\|_\infty \leq 2\epsilon \leq \delta^2/4C$, we have that $|\langle Q_i'', Q_j \rangle| \geq 2\epsilon$ for every $j \in A_i$.
Therefore, since $Q_i'$ and $Q_i''$ have the same base, Lemma 7.1 (with $\zeta = 2\epsilon$ and $\rho$ replaced by
$\epsilon\rho/800d5^d$) tells us that each function $\overline{Q_i'}Q_j$ with $j \in A_i$ can be written as $F + G$, with
$\|G\|_\infty \leq \epsilon + 8d^2\epsilon^8$ and $\|F\|_{U^2}^* \leq (8/\epsilon^8)^{4d^2}(2^{20}d^35^d/\epsilon\rho)^d$. Therefore, $\sum_{j \in A_i} \lambda_j\overline{Q_i'}Q_j$ can be
written as $F + G$ with $\|G\|_\infty \leq (\epsilon + 8d^2\epsilon^8)\sum_{j \in A_i}|\lambda_j|$ and
$\|F\|_{U^2}^* \leq (8/\epsilon^8)^{4d^2}(2^{20}d^35^d/\epsilon\rho)^d\sum_{j \in A_i}|\lambda_j|$.
(In this proof the functions $F$ and $G$ may vary from line to line.) This implies also that
$\|F\|_\infty \leq 2\sum_{j \in A_i}|\lambda_j|$ (since, as can easily be checked, $\epsilon + 8d^2\epsilon^8 \leq 1$).
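
The parameter substitution into Lemma 7.1 can be checked term by term. The following short computation is included as a sanity check; it assumes only the bounds stated in Lemma 7.1, with $\zeta = 2\epsilon$ and the width $\epsilon\rho/800d5^d$ in place of $\rho$.

```latex
\begin{align*}
\frac{\zeta}{2} + \frac{d^2\zeta^8}{2^5}
 &= \epsilon + \frac{2^8 d^2\epsilon^8}{2^5} = \epsilon + 8d^2\epsilon^8, \\
\frac{2^{11}}{\zeta^8}
 &= \frac{2^{11}}{2^8\epsilon^8} = \frac{8}{\epsilon^8}, \\
800d^2 \cdot \frac{800\,d\,5^d}{\epsilon\rho}
 &= \frac{640000\,d^3 5^d}{\epsilon\rho} \leq \frac{2^{20} d^3 5^d}{\epsilon\rho}
 \qquad (2^{20} = 1048576).
\end{align*}
```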
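Since $|Q_i'(x)|^2 = 1$ for all but at most $\epsilon N$ values of $x$, we have the estimate
$\|1 - |Q_i'|^2\|_1 \leq \epsilon$. It follows that

$$\Bigl\|\sum_{j \in A_i} \lambda_jQ_j - Q_i'\sum_{j \in A_i} \lambda_j\overline{Q_i'}Q_j\Bigr\|_1 = \Bigl\|(1 - |Q_i'|^2)\sum_{j \in A_i} \lambda_jQ_j\Bigr\|_1 \leq \epsilon\sum_{j \in A_i}|\lambda_j|.$$

Hence, $\sum_{j \in A_i} \lambda_jQ_j$ can be written in the form $Q_i'U_i + V_i$ with $\|V_i\|_1 \leq (2\epsilon + 8d^2\epsilon^8)\sum_{j \in A_i}|\lambda_j|$
and $\|U_i\|_{U^2}^* \leq (8/\epsilon^8)^{4d^2}(2^{20}d^35^d/\epsilon\rho)^d\sum_{j \in A_i}|\lambda_j|$. Our bound for $\|F\|_\infty$ in the previous
paragraph also gives us that $\|U_i\|_\infty \leq 2\sum_{j \in A_i}|\lambda_j|$.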

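Putting all this together, we find that $\sum_{i=1}^k \sum_{j \in A_i} \lambda_jQ_j$ can be written as $\sum_{i=1}^k U_iQ_i' + V$,
with $\sum_{i=1}^k \|U_i\|_{U^2}^* \leq (8/\epsilon^8)^{4d^2}(2^{20}d^35^d/\epsilon\rho)^dC$, $\sum_{i=1}^k \|U_i\|_\infty \leq 2C$, and
$\|V\|_1 \leq (2\epsilon + 8d^2\epsilon^8)C \leq \delta$. We have therefore written $f$ as $\sum_{i=1}^k U_iQ_i' + (V + g'' + g') + h$. Since all of
$\|V\|_1$, $\|g''\|_1$ and $\|g'\|_1$ are at most $\delta$, the theorem is proved. □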



8. The structure of a function QU when Q has low rank 

The main result of the previous section gives us a decomposition of the form
$f = \sum_{i=1}^k Q_i'U_i + g + h$, where $g$ and $h$ are error terms, the functions $U_i$ have bounded
$U^2$-dual norms, and the $Q_i'$ are quadratic averages. Moreover, the quadratic averages are
$(\epsilon,m)$-special, an important property which we shall make use of shortly.

The aim of this section is to find a "structured set" $S$ such that the functions
$\omega^{q_i(x - x_i) + \phi_i(x - x_i)}$ are all approximately $S$-invariant, where this means that they do not
vary much if you add an element of $S$ to $x$. We already have many of the tools to do this: the
main task of this section will be to develop a little further some of the results of the last two
sections. We shall soon say what a structured set is, but one can think of it as a set that
resembles a lattice convex body in the way that a Bohr set or a multidimensional arithmetic
progression does.

As in Section 6 we shall make use of the fact that a quadratic homomorphism defined
on a multidimensional arithmetic progression can be explicitly described. We shall use
elements from the proofs of some of the lemmas in that section.

It may seem as though the next lemma has basically already been proved in Section 6.
In a sense, that is true, but we need to run the argument again in order to make very
clear that the phase function $f$ that appears in the statement below is independent of the
translate of $P'$. Later we shall see why that is so important.

Lemma 8.1. Let $B$ be a regular Bohr set, let $P$ be a $d$-dimensional arithmetic progression
such that $P + P \subset B$, let $q$ be a quadratic homomorphism defined on $B$, let $Q(x) = \omega^{q(x)}$
for every $x \in B$, and suppose that the rank of $Q$ with respect to $P$ is at most $\log(1/\alpha)$. Let
$\epsilon > 0$ and let $\theta = \alpha^2\epsilon/8d^2$. Then there is a subprogression $P' \subset P$ of dimension $d$ and size
at least $(\alpha/8)^{2d^2}\theta^d|P|$, and a multiplicative Freiman homomorphism $f$ from $P$ to the unit
circle in $\mathbb{C}$, such that $|Q(x)f(x) - Q(y)f(y)| \leq \epsilon$ whenever $x - y \in P'$. Moreover, if $Q' = Qg$
for some multiplicative homomorphism $g$, then we can choose the same subprogression $P'$
to work for $Q'$.

Proof. First, recall from the proof of Corollary 6.9 that the restriction of $q$ to $P$ is given
by a formula of the form $q(x) = \sum_{i,j} a_{ij}x_ix_j + \sum_i b_ix_i + c$. Here, we are writing a typical
point $x \in P$ as $u_0 + \sum_i x_iu_i$, where $0 \leq x_i < m_i$. Moreover, there are coefficients $b_i'$ and $c_j'$
such that, setting $\beta(u,v) = 2\sum_{i,j} a_{ij}u_iv_j + \sum_i b_i'u_i + \sum_j c_j'v_j$, we have $|\mathbb{E}_{u,v \in P}\,e(\beta(u,v))| \geq \alpha$.

Corollary 6.2 (with $\alpha = 2c$) then gives us rational approximations
$|2a_{ij} - p_{ij}/q_{ij}| \leq 8\alpha^{-2}/m_im_j$, with $q_{ij} \leq 4\alpha^{-1}$.

Now let $x = u_0 + \sum_i x_iu_i$ and $y = u_0 + \sum_i (x_i + w_i)u_i$ be two points in $P$. Then

$$q(y) - q(x) = \sum_{i,j} a_{ij}w_iw_j + \sum_{i,j} a_{ij}x_iw_j + \sum_{i,j} a_{ij}w_ix_j + \sum_i b_iw_i.$$

As in the proof of Corollary 6.2, let $q_i = \prod_j q_{ij} \times \prod_j q_{ji}$. Then $q_i \leq (4\alpha^{-1})^{2d}$. Suppose now
that each $w_i$ is even and a multiple of $q_i$ and that $w_i \leq \theta m_i$. Then $\|a_{ij}w_ix_j\|$ and $\|a_{ij}w_iw_j\|$
are both at most $8\alpha^{-2}\theta$, since $w_i$ is an even multiple of $q_i$, $|2a_{ij} - p_{ij}/q_{ij}| \leq 8\alpha^{-2}/m_im_j$,
and $w_j$ and $x_j$ are both at most $m_j$. Therefore, if $\theta \leq \alpha^2\epsilon/8d^2$, we find that

$$\omega^{q(y) - q(x)} \approx_\epsilon e\Bigl(\sum_i b_iw_i\Bigr).$$

Let us therefore define $f(x)$ to be $e(-\sum_i b_ix_i)$. Then

$$|Q(y)f(y) - Q(x)f(x)| = \Bigl|\omega^{q(y) - q(x)}e\Bigl(-\sum_i b_iw_i\Bigr) - 1\Bigr| \leq \epsilon,$$

which proves the first statement.

The second statement is trivial: if $Q' = Qg$ then all we have to do is choose the same
subprogression $P'$ and replace $f$ by $fg^{-1}$. □

It follows from this lemma that if $X$ is a set on which $f$ is approximately equal to 1,
then $Q$ is roughly constant on translates of $X \cap P'$. We shall now prove that such sets have
a structure that is similar to that of Bohr sets.

To do this, we shall make use of the notion of Bourgain systems. This is an abstract
notion introduced by Green and Sanders [GrS07] that is designed to capture the properties
one actually uses of Bohr sets in most applications. A Bourgain system of dimension $d$ is
a collection of sets $X_\rho$, one for each $\rho \in [0,4]$, satisfying the following properties.
• If $\rho' \leq \rho$ then $X_{\rho'} \subset X_\rho$.

• $0 \in X_0$.

• $X_\rho = -X_\rho$.

• If $\rho + \rho' \leq 4$ then $X_\rho + X_{\rho'} \subset X_{\rho + \rho'}$.

• If $\rho \leq 1$ then $|X_{2\rho}| \leq 2^d|X_\rho|$.

An important fact about Bourgain systems is that there is an analogue of the notion of
a regular Bohr set. The next lemma is Lemma 4.12 of [GrS07] (though we have stated it
slightly differently).

Lemma 8.2. Let $(X_\rho)$ be a Bourgain system of dimension $d$ and let $0 < r < 1$. Then
there exists $\rho \in [r/2, r]$ such that $|X_{\rho(1+\kappa)}| \leq (1 + 10d\kappa)|X_\rho|$ and $|X_{\rho(1-\kappa)}| \geq (1 - 10d\kappa)|X_\rho|$
whenever $0 \leq 10d\kappa \leq 1$.

If $\rho$ has this property, we shall call $X_\rho$ a regular set in the system $(X_\rho)$. (This terminology
is not quite the same as that of Green and Sanders, but is close to the standard terminology
for Bohr sets.)

As we did for Bohr sets, we define a notion of one set in a Bourgain system being
"central" in another.

Definition. Let $(X_\rho)$ be a Bourgain system and let $0 < \sigma < \rho < 1$. We shall say that
$X_\sigma$ is $\epsilon$-central in $X_\rho$ and write $X_\sigma \prec_\epsilon X_\rho$ if both $X_\rho$ and $X_\sigma$ are regular sets, and
$\sigma \in [\epsilon\rho/400d, \epsilon\rho/200d]$.

Note that by Lemma 8.2 we know that if $X_\rho$ is regular then there exists $\sigma$ such that
$X_\sigma \prec_\epsilon X_\rho$. Lemma 4.4 of [GrS07] asserts that if $(X_\rho)$ is a Bourgain system of dimension
$d$ and $\eta \in [0,1]$, then $|X_{\eta\rho}| \geq (\eta/2)^d|X_\rho|$. Therefore, if $X_\sigma \prec_\epsilon X_\rho$, we know that
$|X_\sigma| \geq (\epsilon/800d)^d|X_\rho|$. We also obtain a lower bound for the sizes of the sets in a dilated system
$(Y_\rho) = (X_{\eta\rho})$, which will be useful to us later.

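The next lemma we state without proof because the proof is almost identical to that of
Lemma 2.3 (i).

Lemma 8.3. Let $\epsilon > 0$. Let $(X_\rho)_{0 \leq \rho \leq 4}$ be a Bourgain system and let $0 < \sigma < \rho$ be such
that $X_\sigma \prec_\epsilon X_\rho$. Let $f$ be any function from $\mathbb{Z}_N$ to $\mathbb{C}$ such that $\|f\|_\infty \leq 1$. Then

$$\mathbb{E}_{x \in X_\rho} f(x) \approx_\epsilon \mathbb{E}_{x \in X_\rho}\mathbb{E}_{y \in X_\sigma} f(x + y).$$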

Obvious candidates for Bourgain systems are families of subgroups, Bohr sets and
multidimensional arithmetic progressions. For example, given a Bohr set $B = B(K,\sigma)$, the
set

$$X_\rho = \{x \in \mathbb{Z}_N : |1 - e(rx/N)| \leq \rho\sigma \text{ for all } r \in K\}$$

obviously satisfies the first four of the above properties, and it also satisfies the final one
with $2^d$ replaced by $5^{|K|}$ (see for example Section 8 of [GrT08]). Therefore the sets $X_\rho$ can
be viewed as forming a Bourgain system of dimension $d \leq 3|K|$. A similar statement holds
for a family of multidimensional arithmetic progressions with the same basis but differing
widths.
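
The five properties can be tested numerically for a Bohr-type family in $\mathbb{Z}_N$. The sketch below is illustrative only: the function names are hypothetical, the radii are a finite sample of $[0,4]$, and the sets are defined through the distance $\|rx/N\|$ to the nearest integer instead of $|1 - e(rx/N)|$, an assumption that keeps every comparison exact. The doubling constant is passed in as a parameter so that the value $5^{|K|}$ mentioned above can be tried.

```python
from fractions import Fraction

def bohr_family(N, K, sigma):
    """Return rho -> X_rho = {x in Z_N : ||r*x/N|| <= rho*sigma for all r in K}.

    Assumption: ||t|| is the distance from t to the nearest integer.
    """
    sigma = Fraction(sigma)

    def X(rho):
        width = Fraction(rho) * sigma
        members = set()
        for x in range(N):
            dists = (min((r * x) % N, N - (r * x) % N) for r in K)
            if all(Fraction(t, N) <= width for t in dists):
                members.add(x)
        return members

    return X

def check_bourgain_axioms(N, K, sigma, rhos, doubling):
    """Check the five properties for the radii in rhos (a finite sample of [0, 4])."""
    X = bohr_family(N, K, sigma)
    rhos = sorted(Fraction(r) for r in rhos)
    return {
        "nested": all(X(a) <= X(b) for a in rhos for b in rhos if a <= b),
        "contains_zero": 0 in X(0),
        "symmetric": all({(-x) % N for x in X(r)} == X(r) for r in rhos),
        "sums": all({(x + y) % N for x in X(a) for y in X(b)} <= X(a + b)
                    for a in rhos for b in rhos if a + b <= 4),
        "doubling": all(len(X(2 * r)) <= doubling * len(X(r))
                        for r in rhos if r <= 1),
    }
```

A call such as `check_bourgain_axioms(101, [1, 10], Fraction(1, 4), [0, Fraction(1, 2), 1, 2, 4], 5 ** 2)` returns a dictionary with one boolean per property.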

We shall not yet explain in detail why Bourgain systems are useful. Instead, we shall 
introduce the Bourgain system we wish to use, prove that it is a Bourgain system, and 
then when we need it to satisfy various properties we shall quote appropriate results that 
tell us that all Bourgain systems have those properties. The proofs are not too hard and 
can be found in [GrS07].

Lemma 8.4. Let $m_1, \dots, m_d$ be positive integers and let $P$ be the set $\prod_{i=1}^d [-m_i, m_i]$. Let
$f_1, \dots, f_M$ be multiplicative Freiman homomorphisms from $P$ to $\mathbb{T}$ that take the value 1 at
0, and for each $\rho \in [0,4]$ let

$$X_\rho = \{x \in P : |1 - f_j(x)| \leq \rho \text{ for every } j \leq M\}.$$

Then the sets $(X_\rho)$ form a $2M$-dimensional Bourgain system. Moreover, the relative density
of $X_\rho$ in $P$ is at least $3^{-d}(\rho/2\pi)^M$.

Proof. The first three properties hold trivially. The fourth is almost trivial: the simple
calculation needed is that if $f$ is a multiplicative homomorphism to $\mathbb{T}$, $|1 - f(x)| \leq \rho$, and
$|1 - f(y)| \leq \rho'$, then

$$|1 - f(x + y)| \leq |1 - f(x)| + |f(x) - f(x + y)| = |1 - f(x)| + |1 - f(y)| \leq \rho + \rho'.$$

The only real work comes in proving the fifth property, which bounds the size of $X_{2\rho}$ in
terms of the size of $X_\rho$. The argument here is essentially the same as it is for Bohr sets.
Let us define a map $\psi : P \to \mathbb{T}^M$ by $\psi(x) = (f_1(x), \dots, f_M(x))$. Then $\psi(X_{2\rho}) \subset \mathbb{T}_{2\rho}^M$, where
$\mathbb{T}_{2\rho} = \{z \in \mathbb{C} : |z| = 1, |1 - z| \leq 2\rho\}$.

We can cover $\mathbb{T}_{2\rho}$ by four segments of the circle that have diameter at most $\rho$. Let us
use all $4^M$ possible products of these sets to cover the set $\mathbb{T}_{2\rho}^M$. If $Z$ is one of these products
and $\psi(x)$ and $\psi(y)$ both belong to $Z$, then $\psi(x - y) \in \mathbb{T}_\rho^M$, which implies that $x - y \in X_\rho$,
or equivalently that $x \in y + X_\rho$. It follows that if we choose one $y$ for each $Z$ for which
there exists $y$ with $\psi(y) \in Z$, then we have a system of at most $4^M$ translates of $X_\rho$ that
cover $X_{2\rho}$. Therefore, the sets $X_\rho$ form a Bourgain system of dimension $2M$, as claimed.
Now let us turn to the density estimate, which is proved in a similar way. For each
$z = (z_1, \dots, z_M) \in \mathbb{T}^M$, let $\mathbb{T}_{\rho/2}^M(z)$ be the set of all $w = (w_1, \dots, w_M) \in \mathbb{T}^M$ such that
$|z_i - w_i| \leq \rho/2$ for every $i$. For any $z_i \in \mathbb{T}$, the arc of points $w_i \in \mathbb{T}$ such that $|z_i - w_i| \leq \rho/2$
has length at least $\rho$, so the density of $\mathbb{T}_{\rho/2}^M(z)$ is at least $(\rho/2\pi)^M$.

Let us write $P/2$ for the set $\prod_{i=1}^d [-m_i/2, m_i/2]$. Then $|P/2| \geq 3^{-d}|P|$ (because the
worst case is when every $m_i$ is equal to 1). Hence, by averaging we can find $z \in \mathbb{T}^M$ such
that $\psi(x) \in \mathbb{T}_{\rho/2}^M(z)$ for at least $3^{-d}(\rho/2\pi)^M|P|$ points $x \in P/2$. Let $x$ be any such point.
If $y$ is any other such point, then $y - x \in P$ and $\psi(x)$ and $\psi(y)$ both belong to $\mathbb{T}_{\rho/2}^M(z)$,
which implies that $\psi(y - x) = \psi(y)\overline{\psi(x)} \in \mathbb{T}_\rho^M$, so $y - x \in X_\rho$. Hence $X_\rho$ must contain at
least $3^{-d}(\rho/2\pi)^M|P|$ distinct points. □
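
The density estimate can be illustrated on a small box. The sketch below makes an illustrative choice that is not in the text: the homomorphisms are taken to be characters $f_j(x) = e(\langle \theta_j, x \rangle)$ for given frequency vectors $\theta_j$, and the function names are hypothetical. It enumerates $X_\rho$ by brute force and computes the lower bound $3^{-d}(\rho/2\pi)^M|P|$ for comparison.

```python
import cmath
import itertools
import math

def bourgain_set(ms, thetas, rho):
    """X_rho = {x in prod [-m_i, m_i] : |1 - f_j(x)| <= rho for all j}.

    Assumption: f_j(x) = exp(2*pi*i*<theta_j, x>), which is multiplicative
    and takes the value 1 at 0.
    """
    box = itertools.product(*[range(-m, m + 1) for m in ms])
    out = []
    for x in box:
        phases = (sum(t * xi for t, xi in zip(theta, x)) for theta in thetas)
        if all(abs(1 - cmath.exp(2j * math.pi * p)) <= rho for p in phases):
            out.append(x)
    return out

def density_lower_bound(ms, M, rho):
    """The quantity 3^{-d} (rho / 2 pi)^M |P| from the density estimate."""
    d = len(ms)
    size_P = math.prod(2 * m + 1 for m in ms)
    return 3 ** (-d) * (rho / (2 * math.pi)) ** M * size_P
```

For example, with `ms = (6, 6)` and `thetas = [(1/7, 2/7), (3/7, 1/7)]`, one can compare `len(bourgain_set(ms, thetas, 1.0))` with `density_lower_bound(ms, 2, 1.0)`.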

Let $P \subset \mathbb{Z}_N$ be a proper generalized progression. We shall need a lemma to tell us that
we can cover $\mathbb{Z}_N$ reasonably efficiently with translates of $P$. The proof is essentially the
same as the proof of Lemma 7.2.

Lemma 8.5. Let $P \subset \mathbb{Z}_N$ be a proper arithmetic progression of dimension $d$ and density
$\gamma$. Then there is a system of at most $3^d\gamma^{-1}$ translates of $P$ that covers $\mathbb{Z}_N$.

Proof. Let $P = \{\sum_{i=1}^d a_ix_i : 0 \leq a_i < m_i\}$ and let $P' = \{\sum_{i=1}^d a_ix_i : 0 \leq a_i < m_i/2\}$.
Then let $u_1, \dots, u_m$ be a maximal set such that the sets $P' + u_i$ are disjoint. Note that
$P' - P' = \{\sum_{i=1}^d a_ix_i : -m_i/2 < a_i < m_i/2\} \subset P - \sum_i \lfloor m_i/2 \rfloor x_i$. Let $z = \sum_i \lfloor m_i/2 \rfloor x_i$.

Then the sets $P + u_i - z$ form a cover, since for every $x$ there exists $u_i$ such that
$(x + P') \cap (u_i + P') \neq \emptyset$, which implies that $x \in u_i + P' - P' \subset u_i + P - z$. Since $P'$ has
cardinality at least $3^{-d}|P|$ and therefore density at least $3^{-d}\gamma$, the result is proved. □

For the next lemma we shall make use of the concept of "special" quadratic averages,
which was defined just after the proof of Corollary 7.3.

Corollary 8.6. Let $B$ be a Bohr set of dimension $d$ and let $q$ be a quadratic homomorphism
defined on $B$. Let $m$ be a positive integer and let $B' \prec_{\epsilon/5^d} B$ be another Bohr set. Let $Q$
be an $(\epsilon,m)$-special quadratic average with base $(B',q)$; in other words, for all but at most
$\epsilon N$ points $x \in \mathbb{Z}_N$ the restriction of $Q$ to $x + B'$ is equal to the restriction of $\omega^{q_i}$ to $x + B'$,
where $q_i$ is one of at most $m$ translates of $q$. Let $P \subset \mathbb{Z}_N$ be a proper generalized arithmetic
progression of dimension $d$ and density $\gamma$ such that $2P - 2P \subset B'$. Then there is a set $V$
of size at most $3^d\gamma^{-1}$ such that for at least $(1 - \epsilon)N$ values of $x$ there exists $i \leq m$ and
$v \in V$ such that $x + P - P \subset v + 2P - P$ and the restriction of $Q$ to $v + 2P - P$ is equal
to $\omega^{q_i}$.
Proof. By Lemma 8.5, there is a set $V$ of size at most $3^d\gamma^{-1}$ such that every $x$ is in $v + P$
for some $v \in V$. If $x \in v + P$ then $v \in x - P$, so $v + P \subset x + P - P$. Since we assumed that $P$
was such that $2P - 2P \subset B'$, we find that $x + P - P \subset v + 2P - P \subset x + B'$. Since $Q$ is
$(\epsilon,m)$-special, the proportion of $x$ such that the restriction of $Q$ to $x + B'$ is equal to $\omega^{q_i}$
for some $i$ is at least $1 - \epsilon$.

Therefore, as claimed, for at least this proportion of $x$, we have some $v \in V$ such that
$x + P - P \subset v + 2P - P$ and the restriction of $Q$ to $v + 2P - P$ is equal to $\omega^{q_i}$ for some $i$. □

We now come to the main result of this section. The bound may look somewhat
complicated, so let us draw attention to the one feature of it that is very important to us: that
the dependence on $\alpha$ is of a power type rather than exponential. It is for this that we have
put in the work of the last three sections rather than simply applying the local Bogolyubov
lemma. (The fact that the power depends on $d$ is quite expensive, but it produces a doubly
exponential bound rather than the tower-type bound that would have resulted if $\alpha$ had
appeared in the exponent.)

Lemma 8.7. Let $Q$ be a quadratic average that satisfies all the assumptions of the previous
lemma and suppose that the rank of $Q$ is at most $\log(1/\alpha)$ with respect to $P$. Let $\eta > 0$
and let $\theta = \alpha^2\eta/8d^2$. Then there is a subprogression $P' \subset P$ of relative density at least
$(\alpha/8)^{2d^2}\theta^d$ and a Bourgain system $(X_\rho)$ of dimension $2m$ such that each $X_\rho$ is a subset of
$P'$, the relative density of $X_\rho$ is at least $3^{-d}(\rho/2\pi)^m$ inside $P'$, and for every $\rho$ and all but
at most $\epsilon N$ values of $x$, $|Q(y) - Q(x)| \leq \eta + \rho$ for every $y \in x + X_\rho$.

Proof. Corollary 8.6 implies that for at least $(1 - \epsilon)N$ values of $x$ there is some $v \in V$
such that $x + P - P \subset v + 2P - P$ and the restriction of $Q$ to $v + 2P - P$ is $\omega^{q_i}$ for
some translate $q_i$ of $q$. Lemma 8.1 then gives us a progression $P'$ of the density stated,
and a multiplicative homomorphism $f_i$, such that $|Q(y)f_i(y) - Q(z)f_i(z)| \leq \eta$ whenever
$y, z \in v + 2P - P$ and $y - z \in P'$. In particular, $|Q(x)f_i(x) - Q(y)f_i(y)| \leq \eta$ whenever
$y \in x + P'$.

If in addition $|1 - f_i(y - x)| \leq \rho$ for each fixed $i$, then

$$|Q(x) - Q(y)| = |Q(x)f_i(x) - Q(y)f_i(x)| \leq |Q(x)f_i(x) - Q(y)f_i(y)| + |Q(y)||f_i(y) - f_i(x)|,$$

which, using the multiplicative property of $f_i$, equals

$$|Q(x)f_i(x) - Q(y)f_i(y)| + |f_i(y - x) - 1|$$

and can therefore be bounded above by $\eta + \rho$.

By Lemma 8.4, the sets $X_\rho = \{z \in P' : |1 - f_i(z)| \leq \rho \text{ for every } i \leq m\}$ form a
Bourgain system of dimension $2m$, such that $X_\rho$ has relative density at least $3^{-d}(\rho/2\pi)^m$
inside $P'$. □



We have just shown that one special quadratic average $Q$ is roughly invariant under
convolution by sets $X_\rho$ that come from a certain Bourgain system. We now want to
obtain a similar statement for a combination $\sum_{i=1}^k Q_iU_i$ of functions with small $U^2$ dual
norm. The rough idea is to choose for each function $Q_i$ and each function $U_i$ a set from
a Bourgain system with respect to which it is roughly translation invariant, and then to
intersect all these sets. We shall use a lemma of Green and Sanders [GrS07] to prove that
this intersection is reasonably large.

The next lemma is a standard application of Bogolyubov's method.

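Lemma 8.8. Let $f$ be a function from $\mathbb{Z}_N$ to $\mathbb{C}$ and suppose that $\|f\|_{U^2}^* \leq T$ and $\|f\|_\infty \leq C$.
Let $K = \{r \in \mathbb{Z}_N : |\hat{f}(r)| \geq \rho\}$ and let $B$ be the Bohr set $B(K,\rho)$. Then

$$\mathbb{E}_x|f(x + d) - f(x)|^2 \leq \rho^2C^2 + 4T^{4/3}\rho^{2/3}$$

for every $d \in B$.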



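Proof. We apply the Fourier inversion formula and split the expectation into two parts in
the usual manner:

\[ \mathbb{E}_x|f(x+d) - f(x)|^2 = \mathbb{E}_x\Big|\sum_r \hat{f}(r)\big(\omega^{r(x+d)} - \omega^{rx}\big)\Big|^2 \le \sum_{r\in K}|\hat{f}(r)|^2|\omega^{rd} - 1|^2 + 4\sum_{r\notin K}|\hat{f}(r)|^2, \]

which is bounded above by

\[ \rho^2\|\hat{f}\|_2^2 + 4\|\hat{f}\|_{4/3}^{4/3}\rho^{2/3} \le \rho^2C^2 + 4T^{4/3}\rho^{2/3} \]

as claimed, using the fact that $\|\hat{f}\|_{4/3} = \|f\|_{U^2}^*$. □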
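
The estimate of Lemma 8.8 can be tested numerically in a small cyclic group. The sketch below is only an illustration under stated assumptions: it uses the normalisation $\hat{f}(r) = \mathbb{E}_x f(x)\omega^{-rx}$ with $\omega = e^{2\pi i/N}$, takes $T = \|\hat{f}\|_{4/3}$ and $C = \|f\|_\infty$, and assumes that the Bohr set $B(K,\rho)$ consists of all $d$ with $|\omega^{rd} - 1| \le \rho$ for every $r \in K$. The function name `check_lemma_8_8` and the test function `f_example` are hypothetical and do not come from the paper.

```python
import cmath
import math


def check_lemma_8_8(f, rho):
    """Return (worst shift error, claimed bound, size of the Bohr set).

    Assumed conventions (not taken from the paper's text):
      f_hat(r) = E_x f(x) omega^(-r x),  T = ||f_hat||_{4/3},  C = ||f||_inf,
      B(K, rho) = {d : |omega^(r d) - 1| <= rho for all r in K}.
    """
    N = len(f)
    root = [cmath.exp(2j * cmath.pi * k / N) for k in range(N)]
    f_hat = [sum(f[x] * root[(-r * x) % N] for x in range(N)) / N
             for r in range(N)]
    T = sum(abs(c) ** (4 / 3) for c in f_hat) ** (3 / 4)
    C = max(abs(v) for v in f)
    # Large spectrum K and the corresponding Bohr set.
    K = [r for r in range(N) if abs(f_hat[r]) >= rho]
    bohr = [d for d in range(N)
            if all(abs(root[(r * d) % N] - 1) <= rho for r in K)]
    bound = rho ** 2 * C ** 2 + 4 * T ** (4 / 3) * rho ** (2 / 3)
    # E_x |f(x + d) - f(x)|^2, maximised over d in the Bohr set (d = 0 is always there).
    worst = max(sum(abs(f[(x + d) % N] - f[x]) ** 2 for x in range(N)) / N
                for d in bohr)
    return worst, bound, len(bohr)


N = 101
f_example = [math.cos(2 * math.pi * x / N) + 0.5 * math.cos(6 * math.pi * x / N)
             for x in range(N)]
worst, bound, bohr_size = check_lemma_8_8(f_example, 0.3)
print(worst <= bound, bohr_size)
```

With these conventions the inequality holds for every input, because the code's quantities are exactly those appearing in the proof; the check is useful mainly for seeing how much room the bound leaves for a given $f$ and $\rho$.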

Corollary 8.9. Let $Q$ be a quadratic average and let $U$ be a function such that $\|U\|_\infty \le C$.
Let $X$ be a set such that for at least $(1-\epsilon)N$ values of $x \in \mathbb{Z}_N$ we have $|Q(x+d) - Q(x)| \le \eta$
for every $d \in X$. Let $B$ be a set such that $\mathbb{E}_x|U(x+d) - U(x)|^2 \le \gamma$ for every $d \in B$.
Then $\mathbb{E}_x|Q(x+d)U(x+d) - Q(x)U(x)|^2 \le 2\eta^2C^2 + 2\gamma + 4\epsilon C^2$ for every $d \in B \cap X$.
Consequently, if $S$ is any subset of $B \cap X$ and $\sigma$ is the characteristic measure of $S$, then
$\|QU - (QU)*\sigma\|_2 \le 2\eta C + 2\gamma^{1/2} + 2\epsilon^{1/2}C$.
Proof. Let $d \in B \cap X$ and let $x \in \mathbb{Z}_N$ be such that $|Q(x+d) - Q(x)| \le \eta$ for every $d \in X$.
Then

\[ |Q(x+d)U(x+d) - Q(x)U(x)| \le |Q(x+d) - Q(x)||U(x+d)| + |Q(x)||U(x+d) - U(x)|, \]

which, by assumption, is at most

\[ \eta C + |U(x+d) - U(x)|. \]

It follows that for every such $x$ and every $d \in B \cap X$, we have

\[ |Q(x+d)U(x+d) - Q(x)U(x)|^2 \le 2\eta^2C^2 + 2|U(x+d) - U(x)|^2. \]

The proportion of $x$ to which this applies is at least $1 - \epsilon$, by hypothesis. For all other
$x$, we can at least say that $|Q(x+d)U(x+d) - Q(x)U(x)|^2 \le 4\|U\|_\infty^2 \le 4C^2$. The first
statement follows upon taking expectations.
Now

\[ \|QU - (QU)*\sigma\|_2^2 = \mathbb{E}_x|\mathbb{E}_{d\in S}Q(x+d)U(x+d) - Q(x)U(x)|^2, \]

which by Cauchy-Schwarz and the first assertion is bounded above by

\[ \mathbb{E}_{d\in S}\mathbb{E}_x|Q(x+d)U(x+d) - Q(x)U(x)|^2 \le 2\eta^2C^2 + 2\gamma + 4\epsilon C^2. \]

This proves the second statement. □

Recall that the aim of this section is to deal with a sum $\sum_i Q'_iU_i$ in which the functions
$U_i$ have small $U^2$ dual norm and the quadratic averages $Q'_i$ have low rank. Putting
together what we have proved so far enables us to find, for each $i$, a structured set $S_i$
with characteristic measure $\sigma_i$ such that $(Q'_iU_i)*\sigma_i$ is close to $Q'_iU_i$ in $L_2$. Thus, if we let
$S = S_1 \cap \dots \cap S_k$ then we have a measure $\sigma$ such that $(\sum_{i=1}^k Q'_iU_i)*\sigma$ is close to $\sum_{i=1}^k Q'_iU_i$
in $L_2$. As well as making these steps formal, we shall need to prove a lower bound for the
size of $S_1 \cap \dots \cap S_k$.
In order to do so, we generalize a lemma of Green and Sanders about intersections of
sets from Bourgain systems. (It appears in a slightly different form in their paper [GrS07]
as Lemma 4.10.)

Lemma 8.10. Let $(X_\rho)$ and $(Y_\rho)$ be two Bourgain systems in $\mathbb{Z}_N$ of dimensions $d$ and
$d'$, and let the densities of each $X_\rho$ and $Y_\rho$ be $\mu_\rho$ and $\nu_\rho$, respectively. Then $(X_\rho \cap Y_\rho)$
is a Bourgain system of dimension at most $4(d+d')$ and $X_\rho \cap Y_\rho$ has density at least
$2^{-3(d+d')}\mu_\rho\nu_\rho$ whenever $\rho \le 1$.
We will need to have a similar lemma for more than two Bourgain systems. We could
imitate the proof of Green and Sanders for the case of two systems, but for simplicity let 
us just apply their result and obtain a slightly worse bound. 

Corollary 8.11. For $i = 1,2,\dots,s$, let $(X^{(i)}_\rho)$ be a Bourgain system in $\mathbb{Z}_N$ of dimension
$d_i$ and let $X^{(i)}_\rho$ have density $\mu^{(i)}_\rho$. Then the sets $X^{(1)}_\rho \cap \dots \cap X^{(s)}_\rho$ form a Bourgain system
of dimension at most $4s^2(d_1 + \dots + d_s)$ and have density at least $2^{-4s^2(d_1+\dots+d_s)}\mu^{(1)}_\rho\cdots\mu^{(s)}_\rho$
whenever $\rho \le 1$.

Proof. It is enough to prove the result when $\rho = 1$, since for smaller $\rho$ we can take a dilated
system. This allows us to simplify our notation and write $\mu_i$ for $\mu^{(i)}_1$.

We begin by assuming that $s = 2^r$ for some positive integer $r$. Then we form a new
collection of $2^{r-1}$ Bourgain systems by intersecting the old ones in pairs. For instance,
one of the new systems is $(X^{(1)}_\rho \cap X^{(2)}_\rho)$, which has dimension at most $4(d_1 + d_2)$, and the
density of $X^{(1)}_1 \cap X^{(2)}_1$ is at least $2^{-3(d_1+d_2)}\mu_1\mu_2$ by Lemma 8.10 above.

Now we pair off the new systems. The dimension of the first system that results
will be at most $16(d_1 + d_2 + d_3 + d_4)$ and the density when $\rho = 1$ will be at least
$2^{-15(d_1+d_2+d_3+d_4)}\mu_1\mu_2\mu_3\mu_4$.

In general, after $q$ stages we have a dimension of at most $4^q(d_1 + \dots + d_{2^q})$ and a density
when $\rho = 1$ of at least $2^{-(4^q-1)(d_1+\dots+d_{2^q})}\mu_1\cdots\mu_{2^q}$, as can easily be checked by induction.

This proves the result when $s$ is a power of 2, with bounds of $s^2(d_1 + \dots + d_s)$ and
$2^{-(s^2-1)(d_1+\dots+d_s)}\mu_1\cdots\mu_s$. For general $s$, one can simply take a few more Bourgain systems
for which every set is equal to $\mathbb{Z}_N$ in order to make up their number to the next power
of 2. □
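
The bookkeeping in this pairing argument can be replayed mechanically. The following sketch tracks, for each system, a dimension bound $D$ and an exponent $E$ such that the density at $\rho = 1$ is at least $2^{-E}$ times the product of the original densities, and applies the rule of Lemma 8.10 at every pairing. The function name `intersect_bounds` is hypothetical, and treating the padding systems (every set equal to $\mathbb{Z}_N$) as having dimension 0 and density 1 is an assumption made for the illustration.

```python
def intersect_bounds(dims):
    """Return (dimension bound, density exponent) after repeated pairing.

    `dims` is a non-empty list of the dimensions d_1, ..., d_s.
    Each system is stored as (D, E): dimension at most D, density at least
    2**(-E) times the product of the original densities.
    Pairing rule from Lemma 8.10:
        (D1, E1), (D2, E2)  ->  (4 * (D1 + D2), 3 * (D1 + D2) + E1 + E2).
    """
    systems = [(d, 0) for d in dims]
    # Pad up to the next power of two with trivial systems (assumed: D = 0, E = 0).
    while len(systems) & (len(systems) - 1):
        systems.append((0, 0))
    # Intersect in pairs until a single system remains.
    while len(systems) > 1:
        systems = [
            (4 * (D1 + D2), 3 * (D1 + D2) + E1 + E2)
            for (D1, E1), (D2, E2) in zip(systems[0::2], systems[1::2])
        ]
    return systems[0]


print(intersect_bounds([1, 2, 3, 4]))
```

For four systems the result agrees with the values $16(d_1+\dots+d_4)$ and $15(d_1+\dots+d_4)$ quoted in the proof, and for a number of systems that is not a power of 2 the output stays below the weaker bound $4s^2(d_1+\dots+d_s)$ of the corollary.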

We are about to tackle one of the main results of this section, which will eventually
allow us to eliminate the low-rank phases from the decomposition when the function $f$ to
be decomposed has a sufficiently small $U^2$ norm. Very roughly, we shall find a structured set
$S$ that is not too small such that when we convolve the low-rank part of the decomposition
with the characteristic measure $\sigma$ of $S$, it remains approximately unchanged. Later, we
shall also show that convolving $f$ and the rest of the decomposition of $f$ by $\sigma$ creates a
function that is small. From this it follows that the low-rank part of the decomposition is
small. This will give us the $\mathbb{Z}_N$ analogue of Theorem 5.7 in [GW09b]. (The proof has the
same structure as well, but here the argument is substantially more complicated.)

The parameters in Proposition 8.12 below are chosen so that the proposition can be
readily applied to the quadratic averages in the decomposition arising from Theorem 7.5.
An important feature of the precise statement is that the dimension of the Bourgain system
$(S'_{\rho'})$ it produces does not depend on the rank-related quantity $\alpha$.

Proposition 8.12. Suppose that $\alpha$, $\delta$ and $\zeta$ are positive reals. Let $C_0 = 2^{24}$, $d = (2/\delta)^{C_0}$,
$C = 4(2/\delta^2)^{C_0}$ and $\rho = (\delta/2)^{C_0}$. Let $k$ and $m$ be integers bounded above by $2C/\delta^2$ and
$(5/\rho)^d$, respectively. Let $\epsilon > 0$ be at most $(\zeta/20kC)^2$ and let $T = (8/\epsilon^8)^{4d^2}(2^{20}d^35^d/\epsilon\rho)^dC$.
For each $i = 1,2,\dots,k$, let $Q'_i$ be a quadratic average with base $(B'_i, q_i)$ of complexity
at most $(d, \epsilon\rho/800d5^d)$. Moreover, suppose that each $Q'_i$ is an $(\epsilon,m)$-special average, and
that its rank with respect to some $d'$-dimensional progression $P$ of density $\gamma'$ satisfying
$2P - 2P \subseteq B'_i$ for each $i$ is at most $\log(1/\alpha)$. Suppose further that $\sum_{i=1}^k\|U_i\|_\infty \le 2C$ and
that $\sum_{i=1}^k\|U_i\|_{U^2}^* \le T$. Then there exists a Bourgain system $(S'_{\rho'})$ of dimension at most
$32k^3(m + 2^{33}k^6T^4C^2/\zeta^6)$ such that each $S'_{\rho'}$ has density at least

\[ \gamma'^k\Big(\frac{\alpha^4\zeta}{2^{15}kCd'^2}\Big)^{d'^2k}\Big(\frac{\zeta^4\rho'}{2^{27}k^4CT^2}\Big)^{64k^3(m+2^{33}k^6T^4C^2/\zeta^6)} \]

and such that

\[ \Big\|\sum_{i=1}^k Q'_iU_i - \Big(\sum_{i=1}^k Q'_iU_i\Big)*\sigma_{\rho'}\Big\|_2 \le \zeta \]

for every $\rho' \le 1$, where $\sigma_{\rho'}$ is the characteristic measure of $S'_{\rho'}$.

Proof. Let us begin by fixing some $i \in \{1,2,\dots,k\}$. Let $\eta = \zeta/(20kC)$, and set $\theta =
\alpha^2\eta/8d'^2$. First we apply Lemma 8.7 to obtain a subprogression $P'_i \subseteq P$ of relative density
at least $(\alpha/8)^{2d'^2}\theta^{d'}$ and a Bourgain system $(X^{(i)}_{\rho'})$ of dimension at most $2m$ such that each
$X^{(i)}_{\rho'}$ is a subset of $P'_i$, the relative density of $X^{(i)}_{\rho'}$ inside $P'_i$ is at least $3^{-d'}(\rho'/2\pi)^m$, and for
every $\rho'$ and for all but at most $\epsilon N$ values of $x$, we have $|Q'_i(x+y) - Q'_i(x)| \le \eta + \rho'$ for
all $y \in X^{(i)}_{\rho'}$.

Set $\xi = \zeta^3/(2^{15}k^3T^2)$, in which case we can check that $\xi$ also satisfies $4\xi C \le \zeta/5k$.
Apply Lemma 8.8 with $\rho = \xi$ and $C$ replaced with $2C$ to find a set $K_i$ of cardinality at
most $(2C/\xi)^2$ such that

\[ \mathbb{E}_x|U_i(x+y) - U_i(x)|^2 \le 4\xi^2C^2 + 4T^{4/3}\xi^{2/3} \]

for every $y$ in the Bohr set $B(K_i,\xi)$, which has density at least $\xi^{(2C/\xi)^2}$. From this we can
create a Bourgain system $(A^{(i)}_{\rho'})$ of dimension at most $3(2C/\xi)^2$ by setting $A^{(i)}_{\rho'}$ to be the
Bohr set $B(K_i,\rho'\xi)$, in which case the above inequality holds whenever $\rho' \le 1$ and $y \in A^{(i)}_{\rho'}$.

Note that for any value of $\rho'$, the function $U_i$ and the sets $A^{(i)}_{\rho'}$ and $X^{(i)}_{\rho'}$ satisfy the
hypotheses of Corollary 8.9. More precisely, for any fixed $\rho'$, Corollary 8.9 with $X = X^{(i)}_{\rho'}$,
$B = A^{(i)}_{\rho'}$, $\gamma = 4\xi^2C^2 + 4T^{4/3}\xi^{2/3}$, $\eta$ replaced by $\eta + \rho'$ and $C$ replaced with $2C$ tells us that
if $S_i$ is any subset of $A^{(i)}_{\rho'} \cap X^{(i)}_{\rho'}$, and $\sigma_i$ is the characteristic measure of $S_i$, then

\[ \|Q'_iU_i - (Q'_iU_i)*\sigma_i\|_2 \le 4(\eta+\rho')C + 2(4\xi^2C^2 + 4T^{4/3}\xi^{2/3})^{1/2} + 4\epsilon^{1/2}C. \]

Our parameters $\eta$, $\xi$ and $\epsilon$ were chosen so that

\[ \|Q'_iU_i - (Q'_iU_i)*\sigma_i\|_2 \le \zeta/k \]

for each $i = 1,2,\dots,k$, provided that $\rho' \le \zeta/(20kC)$. In particular, letting $S_{\rho'} = (A^{(1)}_{\rho'} \cap
X^{(1)}_{\rho'}) \cap \dots \cap (A^{(k)}_{\rho'} \cap X^{(k)}_{\rho'})$, and writing $\sigma_{\rho'}$ for the corresponding characteristic measure, we
conclude that

\[ \|Q'_iU_i - (Q'_iU_i)*\sigma_{\rho'}\|_2 \le \zeta/k \]

for each $i = 1,2,\dots,k$, and hence that

\[ \Big\|\sum_{i=1}^k Q'_iU_i - \Big(\sum_{i=1}^k Q'_iU_i\Big)*\sigma_{\rho'}\Big\|_2 \le \zeta. \]

Unfortunately, since we are placing a restriction on the size of $\rho'$, the Bourgain system
$(S_{\rho'})_{0\le\rho'\le 4}$ is not quite the one we are looking for. However, we can easily get round this
problem by rescaling: for each $\rho' \in [0,4]$ let us define $S'_{\rho'}$ to be $S_{\rho'\zeta/80kC}$ and let us take the
Bourgain system $(S'_{\rho'})_{\rho'\in[0,4]}$.

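It remains to verify the statements about the dimension and density of the sets $S'_{\rho'}$. Recall
that each set $X^{(i)}_{\rho'}$ had relative density $3^{-d'}(\rho'/2\pi)^m$ with respect to $P'_i$. This subprogression
$P'_i$ itself had relative density $(\alpha/8)^{2d'^2}\theta^{d'}$ with respect to $P$, and $P$ in turn was assumed to
have density $\gamma'$ inside $\mathbb{Z}_N$. Therefore, the density of $X^{(i)}_{\rho'}$ inside $\mathbb{Z}_N$ is at least

\[ \gamma_{\rho'} = \gamma'\Big(\frac{\alpha}{8}\Big)^{2d'^2}\Big(\frac{\alpha^2\zeta}{480kCd'^2}\Big)^{d'}\Big(\frac{\rho'}{2\pi}\Big)^m. \]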



The dimension of each $X^{(i)}_{\rho'}$ was simply $2m$, and the dimension of $A^{(i)}_{\rho'}$ at most $12(C/\xi)^2$.
Hence by Lemma 8.10 we find that each $(A^{(i)}_{\rho'} \cap X^{(i)}_{\rho'})$ is a Bourgain system of dimension at
most $4(2m + 12(C/\xi)^2)$, and the density of $A^{(i)}_{\rho'} \cap X^{(i)}_{\rho'}$ is at least $2^{-3(2m+12(C/\xi)^2)}\xi^{4(C/\xi)^2}\gamma_{\rho'}$.
Finally, by Corollary 8.11 we establish that the Bourgain system $(S_{\rho'}) = ((A^{(1)}_{\rho'} \cap X^{(1)}_{\rho'}) \cap
\dots \cap (A^{(k)}_{\rho'} \cap X^{(k)}_{\rho'}))$ has dimension at most $16k^3(2m + 12(C/\xi)^2)$, and that $S_{\rho'}$ has
density at least $2^{-(16k^3+3k)(2m+12(C/\xi)^2)}\xi^{4k(C/\xi)^2}\gamma_{\rho'}^k$, and therefore the dilated Bourgain
system $(S'_{\rho'})$ has dimension at most $16k^3(2m + 12(C/\xi)^2)$, and $S'_{\rho'}$ has density at least
$(\zeta/160kC)^{16k^3(2m+12(C/\xi)^2)}$ times the density of $S_{\rho'}$.
Revisiting our choice of $\xi$, we find that $12(C/\xi)^2 \le 2^{34}k^6T^4C^2/\zeta^6$, and hence the
dimension of the Bourgain system $(S'_{\rho'})$ satisfies the desired bound. The density of $S'_{\rho'}$ is at
least $(\zeta/320kC)^{32k^3(2m+2^{33}k^6T^4C^2/\zeta^6)}(\zeta^3/2^{15}k^3T^2)^{2^{33}k^7T^4C^2/\zeta^6}\gamma_{\rho'}^k$, which can be simplified and
bounded below by the quantity given in the statement of the proposition. □

Next, we need a technical lemma that we shall use repeatedly in the rest of the paper.
It states that the rank of a quadratic average does not decrease too much when taken
with respect to a slightly smaller set. This statement was proved for $\mathbb{F}_p^n$ using a simple
algebraic argument in [GW09a]. As we have already discussed, arguments that depend
on dimensions of subspaces do not have direct analogues in $\mathbb{Z}_N$, so instead we shall give
an analytic proof. If $\beta$ is a bilinear form, let us define $\alpha_P(\beta)$ to be $\mathbb{E}_{a,a',b,b'\in P}\,\omega^{\beta(a-a',b-b')}$
and $r_P(\beta) = \log\alpha_P(\beta)^{-1}$. Note that if $q$ is a quadratic function that is defined where it needs
to be and $\beta(a,b) = q(a+b) - q(a) - q(b)$, then $r_P(\beta) = r_P(q)$, so all we are doing is
attaching the rank of a quadratic function to the associated bilinear function as well. (By
a "bilinear function" we mean a function that is a Freiman homomorphism in each variable
separately.)

Lemma 8.13. Let $B'$ be a Bohr set, let $\beta$ be a bilinear function defined on $B' \times B'$ and
let $P$ and $B''$ be subsets of $B'$ such that $2P - 2P \subseteq B'$ and $2B'' - 2B'' \subseteq B'$. Then

\[ \alpha_P(\beta) \ge \Big(\frac{|P \cap B''|}{|P|}\Big)^4\alpha_{P\cap B''}(\beta). \]

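Proof. We shall repeatedly make use of the positivity property of the exponential sum that
we used to define the rank of a bilinear form. We start by writing

\[ \alpha_P(\beta) = \mathbb{E}_{x,x',y,y'\in P}\,\omega^{\beta(x-x',y-y')} = \mathbb{E}_{x\in P}\mathbb{E}_{x'\in P}|\mathbb{E}_{y\in P}\,\omega^{\beta(x-x',y)}|^2 = \mathbb{E}_{x\in P}\,g_1(x), \]

where we have written $g_1(x) = \mathbb{E}_{x'\in P}|\mathbb{E}_{y\in P}\,\omega^{\beta(x-x',y)}|^2$. Note that $g_1$ maps into $[0,1]$. Let
$\rho = |P \cap B''|/|P|$. Then the positivity of $g_1$ implies that

\[ \alpha_P(\beta) \ge \rho\,\mathbb{E}_{x\in P\cap B''}\,g_1(x) = \rho\,\mathbb{E}_{x'\in P}\mathbb{E}_{x\in P\cap B''}|\mathbb{E}_{y\in P}\,\omega^{\beta(x-x',y)}|^2 = \rho\,\mathbb{E}_{x'\in P}\,g_2(x'), \]

where this time we have written $g_2(x') = \mathbb{E}_{x\in P\cap B''}|\mathbb{E}_{y\in P}\,\omega^{\beta(x-x',y)}|^2$. Again, $g_2$ is
non-negative so that

\[ \alpha_P(\beta) \ge \rho^2\,\mathbb{E}_{x'\in P\cap B''}\,g_2(x') = \rho^2\,\mathbb{E}_{x,x'\in P\cap B''}|\mathbb{E}_{y\in P}\,\omega^{\beta(x-x',y)}|^2. \]

Interchanging summation, the latter expression equals

\[ \rho^2\,\mathbb{E}_{y,y'\in P}|\mathbb{E}_{x\in P\cap B''}\,\omega^{\beta(x,y-y')}|^2 = \rho^2\,\mathbb{E}_{y\in P}\,g_3(y) \ge \rho^3\,\mathbb{E}_{y\in P\cap B''}\,g_3(y), \]

with $g_3(y) = \mathbb{E}_{y'\in P}|\mathbb{E}_{x\in P\cap B''}\,\omega^{\beta(x,y-y')}|^2$, which is again non-negative. Applying the same
argument one final time, we see that

\[ \rho^3\,\mathbb{E}_{y'\in P}\mathbb{E}_{y\in P\cap B''}|\mathbb{E}_{x\in P\cap B''}\,\omega^{\beta(x,y-y')}|^2 = \rho^3\,\mathbb{E}_{y'\in P}\,g_4(y') \ge \rho^4\,\mathbb{E}_{y'\in P\cap B''}\,g_4(y'), \]

where $g_4(y') = \mathbb{E}_{y\in P\cap B''}|\mathbb{E}_{x\in P\cap B''}\,\omega^{\beta(x,y-y')}|^2$ is non-negative. We have thus shown that

\[ \alpha_P(\beta) \ge \rho^4\,\mathbb{E}_{x,x',y,y'\in P\cap B''}\,\omega^{\beta(x-x',y-y')} = \rho^4\alpha_{P\cap B''}(\beta), \]

which proves the result. □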
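
The inequality of Lemma 8.13 is easy to test numerically when $\beta$ is globally bilinear. The sketch below makes several simplifying assumptions that are not in the paper: it works in $\mathbb{Z}_{101}$, takes $\beta(x,y) = 7xy$, lets $P$ be an interval and replaces $P \cap B''$ by a shorter interval $S \subseteq P$ (the proof above only uses positivity, so any subset of $P$ would do). The names `alpha`, `beta`, `P` and `S` are chosen for the illustration.

```python
import cmath


def alpha(P, beta, N):
    """alpha_P(beta) = E_{a,a',b,b' in P} omega^{beta(a - a', b - b')}.

    Computed as E_{a,a'} |E_b omega^{beta(a - a', b)}|^2, which is valid
    because beta is bilinear, so beta(u, b - b') = beta(u, b) - beta(u, b').
    """
    total = 0.0
    for a in P:
        for a2 in P:
            inner = sum(cmath.exp(2j * cmath.pi * beta(a - a2, b) / N) for b in P)
            total += abs(inner / len(P)) ** 2
    return total / len(P) ** 2


N = 101


def beta(x, y):
    # A globally bilinear form on Z_N (assumed example).
    return (7 * x * y) % N


P = list(range(12))      # plays the role of the progression P
S = list(range(5))       # plays the role of P intersected with B''
rho = len(S) / len(P)

lhs = alpha(P, beta, N)
rhs = rho ** 4 * alpha(S, beta, N)
print(lhs >= rhs)
```

Since both sides are averages of non-negative quantities, the comparison is numerically stable; changing the coefficient 7 or the two intervals gives further instances of the same inequality.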

We shall also need the following lemma from [GW09b] that enables us to take a set of
not too many quadratic functions and partition it into a "low-rank part" and a "high-rank 
part" in such a way that there is a large gap between the ranks in the two parts. We 
shall present the lemma in a slightly modified form and give the simple proof of the precise 
statement we need. 

Lemma 8.14. Let $R_0$, $b > 2$ and $t > 1$ be constants. For each $i = 1,2,\dots,k$, let $Q_i$ be
a quadratic average with base $(B'_i, q_i)$. Then for any $P \subseteq \bigcap_{i=1}^k B'_i$ there is a partition of
$\{1,2,\dots,k\}$ into two sets $L$ and $H$, and a constant $R \in [R_0, b^k(R_0+t)]$, such that the rank
of $Q_i$ with respect to $P$ is at most $R$ for every $i \in L$ and at least $bR + t$ for every $i \in H$.

Proof. Without loss of generality the $Q_i$ are arranged in increasing order of rank with
respect to $P$. If there is no $i$ such that $Q_i$ has rank at least $b^k(R_0 + t)$ with respect to $P$,
then let $L = \{1,2,\dots,k\}$ and let $R = b^k(R_0 + t)$ and we are done.

Otherwise, let $i$ be minimal such that $Q_i$ has rank at least $b^iR_0 + (1 + b + \dots + b^{i-1})t$. Set
$R = b^{i-1}R_0 + (1 + b + \dots + b^{i-2})t$. Then for every $j < i$ the rank of $Q_j$ is at most $R$, and for
every $j \ge i$ the rank of $Q_j$ is at least $bR + t$. Since $R \le b^kR_0 + (1 + b + \dots + b^{k-1})t \le b^k(R_0 + t)$,
the lemma is proved. □
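
The proof is effectively an algorithm, and the following sketch spells it out for a list of numerical ranks. It is an illustration rather than a transcription: the function name `rank_gap_partition` is hypothetical, ranks are passed in as plain numbers, and for simplicity the code looks for the first index (in increasing order of rank) whose rank reaches its own threshold $b^iR_0 + (1 + b + \dots + b^{i-1})t$, falling back on $L = \{1,\dots,k\}$ and $R = b^k(R_0+t)$ when there is none.

```python
def rank_gap_partition(ranks, R0, b, t):
    """Split indices into (L, H, R): rank <= R on L and rank >= b*R + t on H.

    Assumes b >= 2 and R0, t >= 0, so that the thresholds are increasing
    and bounded above by b**k * (R0 + t).
    """
    k = len(ranks)
    order = sorted(range(k), key=lambda i: ranks[i])  # increasing rank

    def threshold(i):
        # b^i R0 + (1 + b + ... + b^(i-1)) t, with threshold(0) = R0
        return b ** i * R0 + sum(b ** j for j in range(i)) * t

    for pos in range(1, k + 1):
        if ranks[order[pos - 1]] >= threshold(pos):
            # First position that jumps over its threshold: cut just below it.
            R = threshold(pos - 1)
            return order[:pos - 1], order[pos - 1:], R
    # No rank reaches its threshold: everything is low rank.
    return order, [], b ** k * (R0 + t)


low, high, R = rank_gap_partition([3, 50, 7, 1000], R0=2, b=2, t=1)
print(low, high, R)
```

The point of the geometric thresholds is that `threshold(pos)` equals `b * threshold(pos - 1) + t`, which is exactly the gap between the two parts that the lemma requires.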

Lemma 8.15. Let $Q$ be a quadratic average with base $(B,q)$, let $B_1 \prec_\eta B$ and suppose
that $Q$ has rank $r$ with respect to a subset $P \subseteq B_1$. Let $Q'$ be another quadratic average,
with base $(B',q')$, where $B'$ has complexity at most $(d,\rho)$. Suppose that $\epsilon$ and $a$ are positive
constants such that

\[ 16d^2a + (11\eta + e^{-r})^{1/8}(4/a)^{4d^2}(800d^2/\rho)^d < 2\epsilon. \]

Then if $\langle Q,Q'\rangle > 2\epsilon$, it follows that $\|Q'\|_{U^2} \le (12a)^{1/8}$.
Proof. The basic idea is that if $\|Q'\|_{U^2}$ is not small, then by Theorem 6.1 we can
approximate it by a quadratic average with smallish $U^2$ dual norm, which shows that $Q'$ cannot
after all correlate with $Q$, which has small $U^2$ norm.

More precisely, Theorem 5.2 tells us that $\|Q\|_{U^2} \le (11\eta + e^{-r})^{1/8}$. Let $a > 0$ and
suppose that $\|Q'\|_{U^2} > (12a)^{1/8}$. Then Theorem 6.10 gives us a function $Q''$ such that
$\|Q' - Q''\|_\infty \le 16d^2a$ and $\|Q''\|_{U^2}^* \le (4/a)^{4d^2}(800d^2/\rho)^d$. It follows that

\[ \langle Q,Q'\rangle \le \|Q\|_1\|Q' - Q''\|_\infty + \|Q\|_{U^2}\|Q''\|_{U^2}^* \le 16d^2a + (11\eta + e^{-r})^{1/8}(4/a)^{4d^2}(800d^2/\rho)^d, \]

which we are assuming to be at most $2\epsilon$. This proves the lemma. □

It turns out that we need to look some distance ahead in order to determine with respect
to what sort of substructure we would like our quadratic averages to have large rank. So for
the time being our choice of substructure will look rather arbitrary. For further justification
the reader may wish to consult the proof of Proposition 10.2 a few pages further along.

It may help if we point out that the unpleasant bound for $c$ in the theorem below is
exponential in $R_0$ and doubly exponential in $\delta$. This, rather than the precise form of the
bound, is what mainly matters to us.

Theorem 8.16. Let $C_0 = 2^{24}$, let $\delta > 0$ and let $C = 4(2/\delta^2)^{C_0}$. Let $f : \mathbb{Z}_N \to \mathbb{C}$ be
a function such that $\|f\|_2 \le 1$ and let $R_0$ be a positive real number. Let $d = (2/\delta)^{C_0}$,
$\rho = (\delta/2)^{C_0}$, let $\epsilon > 0$ be bounded above by $\delta^6/2^{12}C^5$, let $T = (8/\epsilon^8)^{4d^2}(2^{20}d^35^d/\epsilon\rho)^dC$ and
let $c > 0$ be at most

\2^ k d^ d C 

where

\[ \Phi = \Phi(\delta,\epsilon) = \Big(\frac{\delta^5\epsilon\rho}{2^{54}k^8d^45^dC^2T^2}\Big)^{64k^3(m+d^2+2^{33}k^6T^4C^2/\delta^6)}. \]

Let $f$ be any function such that $\|f\|_{U^2} \le c$. Then $f$ has a decomposition of the form

\[ f(x) = \sum_{i=1}^k Q'_i(x)U_i(x) + g(x) + h(x), \]

where $k \le 2C/\delta^2$ and the $Q'_i$ are quadratic averages on $\mathbb{Z}_N$ with base $(B'_i,q_i)$ and of
complexity at most $(d, \epsilon\rho/800d5^d)$, such that $\sum_{i=1}^k\|U_i\|_{U^2}^* \le T$, $\sum_{i=1}^k\|U_i\|_\infty \le 2C$, $\|g\|_1 \le 10\delta$
and $\|h\|_{U^3} \le 2\delta$. Moreover, each quadratic average $Q'_i$ is $(\epsilon,m)$-special for $m \le (5/\rho)^d$,
and there exists a proper generalized arithmetic progression $P$ inside $B' = \bigcap_{i=1}^k B'_i$ of
dimension $d' \le kd$ and density $\gamma' \ge (\epsilon\rho/2^{12}d'd5^d)^{d'}$, such that each $Q'_i$ has rank at least $R_0$
with respect to $P$.
Proof. Because $\epsilon \le \delta^6/2^{12}C^5$, it is also at most $(\delta/2)^{5C_0}$ and therefore satisfies the
hypothesis of Theorem 7.5. We deduce that $f$ has a decomposition of the form

\[ f(x) = \sum_{i=1}^k Q'_i(x)U_i(x) + g'(x) + h'(x), \]

with the following properties: $k \le 2C/\delta^2$, the $Q'_i$ are quadratic averages on $\mathbb{Z}_N$ with base
$(B'_i,q_i)$ and of complexity at most $(d, \epsilon\rho/800d5^d)$, $\sum_{i=1}^k\|U_i\|_{U^2}^* \le T$, $\sum_{i=1}^k\|U_i\|_\infty \le 2C$,
$\|g'\|_1 \le 3\delta$ and $\|h'\|_{U^3} \le \delta$. Moreover, each average $Q'_i$ is $(\epsilon,m)$-special for $m \le (5/\rho)^d$.

By a lemma of Ruzsa [R94] (see also [N96]) there is a proper generalized arithmetic
progression $P \subseteq B'$ with the properties claimed in the theorem. (The additional factor
of 1/4 in the density of this progression arises from the requirement that $2P - 2P \subseteq B'_i$,
which we need in order to be able to talk about the rank of the quadratic average with
respect to $P$.) Let us assume that the quadratic averages are arranged in increasing order
of rank with respect to $P$.

Applying Lemma 8.14 with $b = 2^{13}d^5k^3$ and

11,3, f2 3( - k+15 UW 

we obtain positive integers $R \in [R_0, b^k(R_0 + t)]$ and $s \in \{0,1,\dots,k\}$ such that $Q'_i$ has rank
at most $R$ when $i \le s$ and rank at least $bR + t$ when $i > s$. We collect together the low-
and high-rank quadratic phases by setting $f_L = \sum_{i=1}^s Q'_iU_i$ and $f_H = \sum_{i=s+1}^k Q'_iU_i$.

Because $\epsilon \le \delta^6/2^{12}C^5$ and $k \le 2C/\delta^2$, we also have $\epsilon \le (\delta/20kC)^2$, so it satisfies the
hypothesis of Proposition 8.12 with $\zeta = \delta$. Setting $\log(1/\alpha) = R$ in Proposition 8.12 we
obtain a Bourgain system $(S'_{\rho'})$ of dimension at most $32k^3(m + 2^{33}k^6T^4C^2/\delta^6)$ such that
$\|f_L - f_L*\sigma\|_2 \le \delta$, where $\sigma$ is the characteristic measure of $S'_1$. That proposition also gives
us a lower bound for the density $\gamma$ of $S'_1$ of

\[ \Big(\frac{\alpha^4\delta\epsilon\rho}{2^{27}k^4d^45^dC}\Big)^{d^2k^3}\Big(\frac{\delta^4}{2^{27}k^4CT^2}\Big)^{64k^3(m+2^{33}k^6T^4C^2/\delta^6)}. \]
Now let us reconsider our original decomposition $f = f_L + f_H + g' + h'$. We shall convolve
this equation with the measure $\sigma$ on both sides. We shall show that all of $f*\sigma$, $f_H*\sigma$,
$g'*\sigma$ and $h'*\sigma$ are small and we have already seen that $f_L*\sigma \approx f_L$. From this it will
follow that $f_L$ is small enough to be absorbed into the error terms.

Let us deal with the easy parts first. Since $\|\sigma\|_1 = 1$ and the $L_1$ norm is translation
invariant, the triangle inequality implies that $\|g'*\sigma\|_1 \le \|g'\|_1 \le 3\delta$. Similarly, $\|h'*\sigma\|_{U^3} \le
\delta$, since the $U^3$ norm is also translation invariant.
Next, let us estimate $\|f*\sigma\|_1$. The Cauchy-Schwarz inequality (applied to the Fourier
transform, though a direct argument is also possible) gives us that $\|f*\sigma\|_1 \le \|f*\sigma\|_2 \le
\|f\|_{U^2}\|\sigma\|_{U^2}$. But we are assuming that $\|f\|_{U^2} \le c$ and we know that $\|\sigma\|_{U^2} \le \|\sigma\|_\infty = \gamma^{-1}$.
Thus provided that $c \le \delta\gamma$, we obtain the bound $\|f*\sigma\|_2 \le \delta$. This gives us the upper
bound that $c$ will be required to satisfy for the theorem to hold.

Our one remaining task is to show that $\|f_H*\sigma\|_1$ is small (when the parameters are
appropriately chosen). This is significantly harder, and we shall need to use Lemma 8.15.

Recall first that each quadratic average $Q'_i$ that appears in $f_H$ has base $(B'_i,q_i)$ and
rank at least $bR + t$ with respect to the progression $P \subseteq B'$. We also recall from the
proof of Theorem 7.5 that $Q'_iU_i(x)$ can be written as $\sum_{j\in A_i}\lambda_jQ_j(x)$ plus an error term whose
norm is at most $(2\epsilon + 8d^2\epsilon^8)\sum_{j\in A_i}|\lambda_j|$. The functions $Q_j$ are quadratic averages with base $(B_j,q_j)$ and
complexity at most $(d,\rho)$. Let $f'_H(x) = \sum_{i>s}\sum_{j\in A_i}\lambda_jQ_j(x)$. We have

\[ \|f_H*\sigma\|_1 \le \|f'_H*\sigma\|_1 + \|(f_H - f'_H)*\sigma\|_1 \le \|f'_H*\sigma\|_1 + \sum_{i>s}\sum_{j\in A_i}|\lambda_j|(2\epsilon + 8d^2\epsilon^8). \]

The latter term was shown to be at most $\delta$ in the proof of Theorem 7.5. It follows that
$\|f_H*\sigma\|_1 \le \|f'_H*\sigma\|_1 + \delta$, and thus it suffices to estimate $\|f'_H*\sigma\|_1$. In fact, we shall
obtain an upper bound for $\|f'_H*\sigma\|_2$.

By the Cauchy-Schwarz inequality as used on $\|f*\sigma\|_2$ earlier we have

\[ \|f'_H*\sigma\|_2 \le \|f'_H\|_{U^2}\|\sigma\|_{U^2} \le \gamma^{-1}\sum_{i>s}\sum_{j\in A_i}|\lambda_j|\|Q_j\|_{U^2}, \]

with $\sum_{i>s}\sum_{j\in A_i}|\lambda_j| \le C$. In order to prove that $\|f'_H*\sigma\|_1 \le \delta$ it will therefore be enough
to show that each $Q_j$ has $U^2$ norm at most $\delta\gamma/C$. To do this, we extract further information
from the proof of Theorem 7.5. It tells us that there is another quadratic average $Q''_i$ with
the same base $(B'_i,q_i)$ as $Q'_i$ such that $\langle Q''_i,Q_j\rangle > 2\epsilon$. Since $Q''_i$ has the same high rank as
$Q'_i$ and correlates with $Q_j$, we are in a position to apply Lemma 8.15.
To do this, we set $Q = Q''_i$, $B = B'_i$ and $q = q_i$. We shall let

77 \4J \800d 2 ) \ACJ 

and we shall take $B_1$ to be a Bohr subset $B''$ of $B'_i$ such that $B'' \prec_\eta B'_i$. We then take $Q'$
to be $Q_j$, remarking that $Q_j$ has base $(B_j,q_j)$ for some $B_j$ of complexity at most $(d,\rho)$.

We shall take the set $P$ in Lemma 8.15 to be the set $P \cap B''$ here. We now need a lower
bound for the rank of $Q = Q''_i$ with respect to $P \cap B''$, or equivalently an upper bound
for the quantity $\alpha_{P\cap B''}(Q)$. Lemma 8.13 tells us that $\alpha_{P\cap B''}(Q) \le \beta^{-4}\alpha_P(Q) \le \beta^{-4}e^{-(bR+t)}$,
where $\beta = |P \cap B''|/|P|$ is the relative density of $P \cap B''$ in $P$. By Lemma 8.10 we find
that $\beta$ is at least $2^{-3(kd+3d)}$ times the density of $B''$, so $\beta \ge (\eta\epsilon\rho/2^{3(k+10)}d^25^d)^d$. Therefore,
we can take $e^{-r}$ in Lemma 8.15 to be $\beta^{-4}e^{-(bR+t)}$ with this value of $\beta$. It can now be
checked (the checking, though painful, is routine) that if we take $a = (\delta\gamma/C)^8/12$, then
the conditions for Lemma 8.15 are satisfied. Therefore, by that lemma, $\|Q_j\|_{U^2} \le \delta\gamma/C$.

This completes the proof that $\|f_H*\sigma\|_1 \le 2\delta$. We have therefore demonstrated that it
is possible to write $f_L$ as a sum $g'' + h''$ with $\|g''\|_1 \le 7\delta$ and $\|h''\|_{U^3} \le \delta$. It follows that $f$
has a decomposition $f = f_H + g + h$ with $\|g\|_1 \le 10\delta$ and $\|h\|_{U^3} \le 2\delta$ as claimed. Finally,
we remark that the rank $R$ was at most $b^k(R_0 + t)$, a condition which we insert into our
bound for the uniformity parameter $c$ to obtain the theorem as stated. □

9. Some facts about ranks of quadratic and bilinear functions on Bohr sets

In $\mathbb{F}_p^n$, it was more or less self-evident that the rank of the sum of two quadratic forms was
bounded above by the sum of the individual ranks. Such subadditivity, even in approximate
form, is no longer a trivial statement for forms of higher degree such as those in [GW09c],
and, as it turns out, for the locally defined quadratic forms that we are dealing with in
this paper. Here we shall use regular sets from Bourgain systems to adapt the analytic
proof of subadditivity for $\mathbb{F}_p^n$ given in [GW09c] to $\mathbb{Z}_N$. The reader may wish to consult the
finite-fields argument in that paper before embarking on this section.

The following standard identity is the key ingredient in the proof of subadditivity. 

Lemma 9.1. Let $B \subseteq \mathbb{Z}_N$ and let $\beta : B^2 \to \mathbb{Z}_N$ be a bilinear function and let $f(x,y) =
\omega^{\beta(x,y)}$. Then

\[ f(a-a',b-b') = f(x+a,y+b)\,\overline{f(x+a,y+b')}\,\overline{f(x+a',y+b)}\,f(x+a',y+b') \]

provided that all of $a-a'$, $b-b'$, $x+a$, $x+a'$, $y+b$, $y+b'$ lie in $B$.

Proof. This follows immediately from the identity

\[ \beta(a-a',b-b') = \beta(x+a,y+b) - \beta(x+a',y+b) - \beta(x+a,y+b') + \beta(x+a',y+b'), \]

which can easily be checked by hand. □
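
The identity can also be checked by machine in a toy setting. The sketch below assumes a globally defined bilinear form $\beta(x,y) = 5xy$ on $\mathbb{Z}_{13}$, so that the side conditions about membership of $B$ are vacuous; the modulus, the coefficient and the helper name `identity_holds` are choices made for this illustration only.

```python
import cmath
import itertools

N = 13
c = 5


def beta(x, y):
    # Globally bilinear form on Z_N (assumed example).
    return (c * x * y) % N


def f(x, y):
    # f = omega^beta with omega = exp(2*pi*i/N).
    return cmath.exp(2j * cmath.pi * beta(x, y) / N)


def identity_holds(a, a2, b, b2, x, y):
    """Check f(a-a', b-b') against the four-term product of Lemma 9.1."""
    lhs = f(a - a2, b - b2)
    rhs = (f(x + a, y + b)
           * f(x + a, y + b2).conjugate()
           * f(x + a2, y + b).conjugate()
           * f(x + a2, y + b2))
    return abs(lhs - rhs) < 1e-9


sample = range(0, N, 3)
print(all(identity_holds(*v) for v in itertools.product(sample, repeat=6)))
```

The two conjugated factors correspond to the two terms with a minus sign in the additive identity for $\beta$, and the variables $x$ and $y$ drop out of the product entirely.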

Lemma 9.2. Let $B, B'$ be two sets from a Bourgain system and suppose that $B' \prec_\epsilon B$.
Write $\pi$ for the characteristic measure of $B$. Then for every $s \in B'$ and every function
$j : \mathbb{Z}_N \to \mathbb{C}$ with $\|j\|_\infty \le 1$, we have

\[ \mathbb{E}_{u\in\mathbb{Z}_N}\,\pi*\pi(u)j(u+s) \approx_\epsilon \mathbb{E}_{u\in\mathbb{Z}_N}\,\pi*\pi(u)j(u). \]

In particular, it follows immediately that for any $A \subseteq B'$,

\[ \mathbb{E}_{u\in\mathbb{Z}_N}\,\pi*\pi(u)j(u) \approx_\epsilon \mathbb{E}_{u\in\mathbb{Z}_N}\mathbb{E}_{s\in A}\,\pi*\pi(u)j(u+s). \]

Proof. We estimate the difference between the left- and right-hand side above by expanding
out the convolution and using the triangle inequality.

\[ \begin{aligned}
|\mathbb{E}_{u\in\mathbb{Z}_N}\,\pi*\pi(u)j(u+s) - \pi*\pi(u)j(u)| &= |\mathbb{E}_{u,z\in\mathbb{Z}_N}\,\pi(z)\pi(u-z)(j(u+s) - j(u))| \\
&\le \mathbb{E}_{z\in\mathbb{Z}_N}\,\pi(z)|\mathbb{E}_{u\in\mathbb{Z}_N}\,\pi(u)(j_z(u+s) - j_z(u))| \\
&= \mathbb{E}_{z\in\mathbb{Z}_N}\,\pi(z)|\mathbb{E}_{u\in B}\,j_z(u+s) - j_z(u)|,
\end{aligned} \]

where $j_z(u) = j(u+z)$ for all $u$. The inner expectation is at most $\epsilon$ for every $s \in B'$ by
Lemma 8.3. □

We now apply Lemma 9.2 to derive an inequality reminiscent of the usual lemmas that
say that a function behaves quasirandomly if its $U^2$ norm is small. However, our inequality
concerns a "local" version of the $U^2$ norm. Given two sets $B' \prec_\epsilon B$ from a Bourgain system
and a function $h : (\mathbb{Z}_N)^2 \to \mathbb{C}$, we shall define $\|h\|_{U^2(B+B,B')}$ by the formula

\[ \|h\|^4_{U^2(B+B,B')} = \mathbb{E}_{x,y}\,\pi*\pi(x)\,\pi*\pi(y)\,\mathbb{E}_{a,a',b,b'\in B'}\,h(x+a,y+b)\,\overline{h(x+a',y+b)}\,\overline{h(x+a,y+b')}\,h(x+a',y+b'), \]

where $\pi$ is the characteristic measure of $B$.

Lemma 9.3. Let $\epsilon > 0$, let $B' \prec_\epsilon B$ be a regular Bourgain pair and write $\pi$ for the
characteristic measure of $B$. Then for any function $h : (\mathbb{Z}_N)^2 \to \mathbb{C}$ with $\|h\|_\infty \le 1$, we
have the estimate

\[ |\mathbb{E}_{x,y\in\mathbb{Z}_N}\,\pi*\pi(x)\,\pi*\pi(y)h(x,y)|^4 \le \|h\|^4_{U^2(B+B,B')} + 6\epsilon. \]

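Proof. The Cauchy-Schwarz inequality implies that

\[ |\mathbb{E}_{x,y\in\mathbb{Z}_N}\,\pi*\pi(x)\,\pi*\pi(y)h(x,y)|^4 \le \big|\mathbb{E}_{x\in\mathbb{Z}_N}\,\pi*\pi(x)\,|\mathbb{E}_{y\in\mathbb{Z}_N}\,\pi*\pi(y)h(x,y)|^2\big|^2. \]

Lemma 9.2 tells us that

\[ \mathbb{E}_{y\in\mathbb{Z}_N}\,\pi*\pi(y)h(x,y) \approx_\epsilon \mathbb{E}_{y\in\mathbb{Z}_N}\mathbb{E}_{b\in B'}\,\pi*\pi(y)h(x,y+b) \]

for every $x$, from which it follows that

\[ |\mathbb{E}_{y\in\mathbb{Z}_N}\,\pi*\pi(y)h(x,y)|^2 \approx_{2\epsilon} |\mathbb{E}_{y\in\mathbb{Z}_N}\mathbb{E}_{b\in B'}\,\pi*\pi(y)h(x,y+b)|^2. \]

From this it follows that

\[ \big|\mathbb{E}_{x\in\mathbb{Z}_N}\,\pi*\pi(x)\,|\mathbb{E}_{y\in\mathbb{Z}_N}\,\pi*\pi(y)h(x,y)|^2\big|^2 \le \big|\mathbb{E}_{x\in\mathbb{Z}_N}\,\pi*\pi(x)\,|\mathbb{E}_{y\in\mathbb{Z}_N}\mathbb{E}_{b\in B'}\,\pi*\pi(y)h(x,y+b)|^2\big|^2 + 4\epsilon. \]

(For these last two approximations we have used the fact that if $a \approx_\epsilon b$ and $a$ and $b$
both have modulus at most 1, then $a^2 \approx_{2\epsilon} b^2$, which follows from the fact that $a^2 - b^2 =
(a+b)(a-b)$.) By the Cauchy-Schwarz inequality,

\[ \begin{aligned}
\big|\mathbb{E}_{x\in\mathbb{Z}_N}\,\pi*\pi(x)\,|\mathbb{E}_{y\in\mathbb{Z}_N}\mathbb{E}_{b\in B'}\,\pi*\pi(y)h(x,y+b)|^2\big|^2
&\le \big|\mathbb{E}_{x\in\mathbb{Z}_N}\,\pi*\pi(x)\,\mathbb{E}_{y\in\mathbb{Z}_N}\,\pi*\pi(y)\,|\mathbb{E}_{b\in B'}\,h(x,y+b)|^2\big|^2 \\
&= \big|\mathbb{E}_{y\in\mathbb{Z}_N}\,\pi*\pi(y)\,\mathbb{E}_{b,b'\in B'}\mathbb{E}_{x\in\mathbb{Z}_N}\,\pi*\pi(x)h(x,y+b)\overline{h(x,y+b')}\big|^2.
\end{aligned} \]

Applying Lemma 9.2 in a similar way a second time, we see that this is at most

\[ \begin{aligned}
&\big|\mathbb{E}_{y\in\mathbb{Z}_N}\,\pi*\pi(y)\,\mathbb{E}_{b,b'\in B'}\mathbb{E}_{x\in\mathbb{Z}_N}\mathbb{E}_{a\in B'}\,\pi*\pi(x)h(x+a,y+b)\overline{h(x+a,y+b')}\big|^2 + 2\epsilon \\
&\qquad\le \mathbb{E}_{x,y\in\mathbb{Z}_N}\,\pi*\pi(x)\,\pi*\pi(y)\,\mathbb{E}_{b,b'\in B'}\,|\mathbb{E}_{a\in B'}\,h(x+a,y+b)\overline{h(x+a,y+b')}|^2 + 2\epsilon,
\end{aligned} \]

which equals $\|h\|^4_{U^2(B+B,B')} + 2\epsilon$. This proves the lemma. □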



Finally, we need to exploit regularity once more to be able to shift our variables at a
certain point in the proof. We isolate the lemma, which is very similar to Lemma 9.2, in
order to keep the proof of the main result tidy.



Lemma 9.4. Let $B$ and $B'$ be sets from a Bourgain system with $B' \prec_\epsilon B$, and write $\pi$
for the characteristic measure of $B$. Write $\sigma(x) = \pi*\pi(x)$, $\rho$ for the density of $B$ and let
$j : \mathbb{Z}_N \to \mathbb{C}$ be an arbitrary function with $\|j\|_\infty \le 1$. Then for any $a \in B'$,

\[ \mathbb{E}_{x\in\mathbb{Z}_N}\,\sigma(x+a)^2j(x) \approx_{2\epsilon/\rho} \mathbb{E}_{x\in\mathbb{Z}_N}\,\sigma(x)^2j(x). \]
Proof. As usual, we shall attempt to bound the difference between the two sides in absolute
value.

\[ \begin{aligned}
|\mathbb{E}_x(\sigma(x+a)^2 - \sigma(x)^2)j(x)| &= |\mathbb{E}_x(\sigma(x+a) + \sigma(x))(\sigma(x+a) - \sigma(x))j(x)| \\
&= |\mathbb{E}_x(\sigma(x+a) + \sigma(x))j(x)\,\mathbb{E}_v\,\pi(v)(\pi(x+a-v) - \pi(x-v))| \\
&\le 2\rho^{-1}\,\mathbb{E}_x|\mathbb{E}_v\,\pi(v)(\pi(x+a-v) - \pi(x-v))| \\
&\le 2\rho^{-1}\,\mathbb{E}_v\,\pi(v)\,\mathbb{E}_x|\pi(x+a-v) - \pi(x-v)|.
\end{aligned} \]

The expression $|\pi(x+a-v) - \pi(x-v)|$ is non-zero if and only if $x \in (v+B) \,\triangle\, (v-a+B)$,
which by regularity assumptions is the case for at most $\epsilon|B|$ values of $x$. The non-zero
value taken is $\rho^{-1} = N/|B|$ and we conclude that $\mathbb{E}_x|\pi(x+a-v) - \pi(x-v)| \le \epsilon$. The lemma
follows. □

We are now fully prepared to prove subadditivity. We remind the reader that $\alpha_P(\beta) =
\mathbb{E}_{a,a',b,b'\in P}\,\omega^{\beta(a-a',b-b')}$ and $r_P(\beta) = \log\alpha_P(\beta)^{-1}$ for any bilinear form $\beta$ defined on a set that
contains $P - P$.

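Lemma 9.5. Let $\beta_1$ and $\beta_2$ be bilinear forms defined on a Bohr set $B$, and let $(B_\rho)$ be a
Bourgain system of dimension $d$ such that $B_1$ has density $\gamma \le 1/2$ and $2B_1 - 2B_1 \subseteq B$.
Let $\epsilon > 0$. Then

\[ (\alpha_{B_1}(\beta_1)\alpha_{B_1}(\beta_2))^4 \le \gamma^{-6}(800d/\epsilon)^{4d}\alpha_{B_1}(\beta_1 + \beta_2) + 9\epsilon\gamma^{-7}. \]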

Proof. Let $B' \prec_\epsilon B_1$, write $\gamma'$ for the density of $B'$, and note that $\gamma' \ge (\epsilon/800d)^d\gamma$. Write
$f = \omega^{\beta_1}$ and $g = \omega^{\beta_2}$. We shall begin to prove the subadditivity statement by considering
the expression

\[ \begin{aligned}
\alpha_{B_1}(\beta_1)\alpha_{B_1}(\beta_2) &= \mathbb{E}_{x,x',y,y'\in B_1}\,f(x-x',y-y')\,\mathbb{E}_{u,u',v,v'\in B_1}\,g(u-u',v-v') \\
&= \mathbb{E}_{x,y\in\mathbb{Z}_N}\,\pi*\pi(x)\,\pi*\pi(y)f(x,y)\,\mathbb{E}_{u,v\in\mathbb{Z}_N}\,\pi*\pi(u)\,\pi*\pi(v)g(u,v),
\end{aligned} \]

where $\pi$ is the characteristic measure of $B_1$. Shifting two of the variables, we obtain

\[ \mathbb{E}_{x,y\in\mathbb{Z}_N}\,\pi*\pi(x)\,\pi*\pi(y)f(x,y)\,\mathbb{E}_{u,v\in\mathbb{Z}_N}\,\pi*\pi(x+u)\,\pi*\pi(y+v)g(x+u,y+v). \]

Writing $\sigma = \pi*\pi$, we apply Hölder's inequality (or the Cauchy-Schwarz inequality twice)
to show that

\[ \begin{aligned}
(\alpha_{B_1}(\beta_1)\alpha_{B_1}(\beta_2))^4 &\le \mathbb{E}_{u,v}|\mathbb{E}_{x,y}\,\sigma(x)\sigma(y)f(x,y)\sigma(x+u)\sigma(y+v)g(x+u,y+v)|^4 \\
&= \mathbb{E}_{u,v}|\mathbb{E}_{x,y}\,\sigma(x)\sigma(y)h_{u,v}(x,y)|^4
\end{aligned} \]

where we have set $h_{u,v}(x,y) = f(x,y)g(x+u,y+v)\sigma(x+u)\sigma(y+v)$. For every fixed value
of $u$ and $v$, we shall apply Lemma 9.3. From this, we deduce that

\[ \begin{aligned}
(\alpha_{B_1}(\beta_1)\alpha_{B_1}(\beta_2))^4 \le{}& \mathbb{E}_{u,v}\mathbb{E}_{x,y}\,\sigma(x)\sigma(y)\,\mathbb{E}_{a,a',b,b'\in B'}\,h_{u,v}(x+a,y+b) \\
&\overline{h_{u,v}(x+a',y+b)}\,\overline{h_{u,v}(x+a,y+b')}\,h_{u,v}(x+a',y+b') + 12\epsilon.
\end{aligned} \]

(We have omitted the condition $\epsilon < 1/3$ from the statement of this lemma since if $\epsilon \ge 1/3$
then the lemma holds trivially.) Next, we expand out $h_{u,v}$, which replaces the right-hand
side by

\[ \begin{aligned}
\mathbb{E}_{u,v}\mathbb{E}_{x,y}\,\sigma(x)\sigma(y)\,\mathbb{E}_{a,a',b,b'\in B'}\,& f(x+a,y+b)\overline{f(x+a',y+b)}\,\overline{f(x+a,y+b')}f(x+a',y+b') \\
& g(x+u+a,y+v+b)\overline{g(x+u+a',y+v+b)}\,\overline{g(x+u+a,y+v+b')}g(x+u+a',y+v+b') \\
& \sigma(x+u+a)^2\sigma(x+u+a')^2\sigma(y+v+b)^2\sigma(y+v+b')^2 + 12\epsilon.
\end{aligned} \]

Setting $x' = x+u$, $y' = y+v$, we can rewrite this expression as

\[ \begin{aligned}
\mathbb{E}_{x,y,x',y'}\,\sigma(x)\sigma(y)\,\mathbb{E}_{a,a',b,b'\in B'}\,& f(x+a,y+b)\overline{f(x+a',y+b)}\,\overline{f(x+a,y+b')}f(x+a',y+b') \\
& g(x'+a,y'+b)\overline{g(x'+a',y'+b)}\,\overline{g(x'+a,y'+b')}g(x'+a',y'+b') \\
& \sigma(x'+a)^2\sigma(x'+a')^2\sigma(y'+b)^2\sigma(y'+b')^2 + 12\epsilon.
\end{aligned} \]

Lemma 9.1 tells us that this expression is equal to

\[ \mathbb{E}_{x,y,x',y'}\,\sigma(x)\sigma(y)\,\mathbb{E}_{a,a',b,b'\in B'}\,\sigma(x'+a)^2\sigma(x'+a')^2\sigma(y'+b)^2\sigma(y'+b')^2f(a-a',b-b')g(a-a',b-b') + 12\epsilon. \]

Since $\mathbb{E}_x\sigma(x) = 1$, this in turn equals

\[ \mathbb{E}_{a,a',b,b'\in B'}\,f(a-a',b-b')g(a-a',b-b')\,\mathbb{E}_{x',y'}\,\sigma(x'+a)^2\sigma(x'+a')^2\sigma(y'+b)^2\sigma(y'+b')^2 + 12\epsilon. \]

We would like to be able to evaluate the inner expectation independently of the choice
of $a, a', b, b'$. We cannot do this exactly, but Lemma 9.4 tells us that $\sigma$ is approximately
translation invariant, so we can do it if we introduce a small error. For instance, if we apply
it to the first occurrence of the function $\sigma^2$ and let $j(x') = \gamma^6\sigma(x'+a')^2\sigma(y'+b)^2\sigma(y'+b')^2$,
then $\|j\|_\infty \le 1$, so we find that

\[ \mathbb{E}_{x',y'}\,\sigma(x'+a)^2\sigma(x'+a')^2\sigma(y'+b)^2\sigma(y'+b')^2 \le \mathbb{E}_{x',y'}\,\sigma(x')^2\sigma(x'+a')^2\sigma(y'+b)^2\sigma(y'+b')^2 + 2\gamma^{-7}\epsilon. \]

Applying the lemma three more times in this way, we find that

\[ (\alpha_{B_1}(\beta_1)\alpha_{B_1}(\beta_2))^4 \le \mathbb{E}_{a,a',b,b'\in B'}\,f(a-a',b-b')g(a-a',b-b')\,\mathbb{E}_{x'}\sigma^4(x')\,\mathbb{E}_{y'}\sigma^4(y') + 8\gamma^{-7}\epsilon + 12\epsilon. \]
But since $\mathbb{E}_{a,a',b,b'\in B'}\,f(a-a',b-b')g(a-a',b-b') = \alpha_{B'}(\beta_1 + \beta_2)$ and $\mathbb{E}_x\sigma^4(x) \le \gamma^{-3}$, we
have shown that

\[ (\alpha_{B_1}(\beta_1)\alpha_{B_1}(\beta_2))^4 \le \gamma^{-6}\alpha_{B'}(\beta_1 + \beta_2) + 8\gamma^{-7}\epsilon + 12\epsilon. \]

Unfortunately the exponential sum on the right-hand side is taken over $B'$, or we would
be done. But we can remedy this situation by applying Lemma 8.13, which implies that
$\alpha_{B'} \le (\gamma/\gamma')^4\alpha_{B_1}$. Therefore,

\[ (\alpha_{B_1}(\beta_1)\alpha_{B_1}(\beta_2))^4 \le \gamma^{-2}\gamma'^{-4}\alpha_{B_1}(\beta_1 + \beta_2) + 8\gamma^{-7}\epsilon + 12\epsilon. \]

The result follows from the lower bound for $\gamma'$ mentioned at the beginning of the proof. □

We need a slight generalization of Lemma 9.5 to be able to sum arbitrarily many bilinear forms. In fact, we shall not use Lemma 9.5 as stated to carry out the induction, but rather the main intermediate result in the proof above, which relates the rank of $\beta_1 + \beta_2$ with respect to $B'$ to the individual ranks with respect to $B_1$.

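Lemma 9.6. Let $\epsilon > 0$. For $i = 1, 2, \dots, m$, let $\beta_i$ be a bilinear form defined on a set $B$, and let $(B_\rho)$ be a Bourgain system of dimension $d$ such that $B_1$ has density $\gamma < 1/8$ and $2B_1 - 2B_1 \subseteq B$. Then

$$\prod_{i=1}^m \alpha_{B_1}(\beta_i) \le \gamma^{-2m^2}(800d/\epsilon)^{d\log m/m^2}\,\alpha_{B_1}\Big(\sum_{i=1}^m \beta_i\Big)^{1/m^2} + 8\gamma^{-2m^2}(\epsilon^{1/4}/\gamma^2)^{1/m^2}.$$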

Proof. Let us start off by considering the case when $m = 2^s$. Let $B_{s+1} \prec_\epsilon B_s \prec_\epsilon \dots \prec_\epsilon B_2 \prec_\epsilon B_1$ be a sequence of sets from the Bourgain system $(B_\rho)$. (Thus, the indices do not indicate values of $\rho$.) We shall prove that

$$\prod_{i=1}^{2^s}\alpha_{B_1}(\beta_i) \le A^{4^s}\alpha_{B_{s+1}}\Big(\sum_{i=1}^{2^s}\beta_i\Big)^{1/4^s} + 4a^{1/4^s}A^{4^s},$$

where $A = \gamma^{-3/2}$ and $a = 2\gamma^{-2}\epsilon^{1/4}$, and proceed by induction on $s$. The case where $s = 1$ is guaranteed by the proof of Lemma 9.5. Indeed, before we switched from $B'$ back to $B_1$, the inequality we had implied that

$$\alpha_{B_1}(\beta_1)\alpha_{B_1}(\beta_2) \le A\,\alpha_{B'}(\beta_1+\beta_2)^{1/4} + a,$$

on the assumption that $B' \prec_\epsilon B_1$. If we take $B' = B_2$, then this is in fact stronger than the case $s = 1$ of this lemma.
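The passage from the fourth-power bound in the proof of Lemma 9.5 to the inequality with constants $A = \gamma^{-3/2}$ and $a = 2\gamma^{-2}\epsilon^{1/4}$ can be checked as follows. This is only a sketch of the arithmetic: it assumes $\gamma < 1/8$ (as in the statement of Lemma 9.6) and that $\alpha_{B'}(\beta_1+\beta_2)$ is non-negative, so that fourth roots make sense.

```latex
\begin{align*}
(\alpha_{B_1}(\beta_1)\alpha_{B_1}(\beta_2))^4
  &\le \gamma^{-6}\alpha_{B'}(\beta_1+\beta_2) + 8\gamma^{-7}\epsilon + 12\epsilon \\
  &\le \gamma^{-6}\alpha_{B'}(\beta_1+\beta_2) + 16\gamma^{-8}\epsilon
  && \text{since } \gamma<1/8 \text{ gives } 8\gamma^{-7}\le\gamma^{-8},\ 12\le\gamma^{-8}, \\
\alpha_{B_1}(\beta_1)\alpha_{B_1}(\beta_2)
  &\le \gamma^{-3/2}\alpha_{B'}(\beta_1+\beta_2)^{1/4} + 2\gamma^{-2}\epsilon^{1/4}
  && \text{since } (x+y)^{1/4}\le x^{1/4}+y^{1/4}, \\
  &= A\,\alpha_{B'}(\beta_1+\beta_2)^{1/4} + a .
\end{align*}
```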
Suppose now that the statement is true for $s$, and consider

$$\begin{aligned}
\prod_{i=1}^{2^{s+1}}\alpha_{B_1}(\beta_i)
&\le \Big(A^{4^s}\alpha_{B_{s+1}}\Big(\sum_{i=1}^{2^s}\beta_i\Big)^{1/4^s} + 4a^{1/4^s}A^{4^s}\Big)\Big(A^{4^s}\alpha_{B_{s+1}}\Big(\sum_{j=2^s+1}^{2^{s+1}}\beta_j\Big)^{1/4^s} + 4a^{1/4^s}A^{4^s}\Big)\\
&\le A^{2\cdot 4^s}\Big(\alpha_{B_{s+1}}\Big(\sum_{i=1}^{2^s}\beta_i\Big)\alpha_{B_{s+1}}\Big(\sum_{i=2^s+1}^{2^{s+1}}\beta_i\Big)\Big)^{1/4^s} + 8a^{1/4^s}A^{2\cdot 4^s} + 16a^{2/4^s}A^{2\cdot 4^s},
\end{aligned}$$

from which it follows by the strengthened version of the $s = 1$ case noted above that

$$\prod_{i=1}^{2^{s+1}}\alpha_{B_1}(\beta_i) \le A^{2\cdot 4^s + 1/4^s}\alpha_{B_{s+2}}\Big(\sum_{i=1}^{2^{s+1}}\beta_i\Big)^{1/4^{s+1}} + A^{2\cdot 4^s}a^{1/4^s} + 8a^{1/4^s}A^{2\cdot 4^s} + 16a^{2/4^s}A^{2\cdot 4^s}.$$

It is easily checked that this expression is bounded above by

$$A^{4^{s+1}}\alpha_{B_{s+2}}\Big(\sum_{i=1}^{2^{s+1}}\beta_i\Big)^{1/4^{s+1}} + 4a^{1/4^{s+1}}A^{4^{s+1}}$$

as claimed, provided that $\gamma < 1/8$. This concludes the inductive step. To complete the proof, we apply Lemma 8.13 to obtain a statement about the rank with respect to $B_1$. It tells us that

2 s 2 s 2 s 

i=l i=l i=l 

with $|B_{s+1}| \ge (\epsilon/800d)^d|B_s| \ge \dots \ge (\epsilon/800d)^{sd}|B_1|$. It follows that

$$\prod_{i=1}^{2^s}\alpha_{B_1}(\beta_i) \le A^{4^s}(800d/\epsilon)^{sd/4^s}\alpha_{B_1}\Big(\sum_{i=1}^{2^s}\beta_i\Big)^{1/4^s} + 4a^{1/4^s}A^{4^s}.$$

For general $m$, note that we can add in bilinear forms that are identically zero without affecting the argument. □

Next we state and prove a modified version of Lemma 6.3 and Corollary 6.4 from [GW09b]. This is the first and only time we make use of the assumption that our system of linear forms is square independent.

Lemma 9.7. Let $\epsilon > 0$. Suppose that $L_i(x) = \sum_{u=1}^d c_{iu}x_u$, $i = 1, 2, \dots, m$, is a square-independent system. Suppose that each of the (not necessarily distinct) bilinear forms $\beta_i$, $i = 1, 2, \dots, m$, is defined on a Bohr set $B$, and that $(B_\rho)$ is a Bourgain system of dimension $d$ such that $B_1$ has density $\gamma$ and $2B_1 - 2B_1 \subseteq B$. Then there exists a pair $(u,v) \in [d]^2$ such that the bilinear form $\beta_{uv} = \sum_{i=1}^m c_{iu}c_{iv}\beta_i$ satisfies

$$\alpha_{B_1}(\beta_{uv}) \le \gamma^{-2m}(800d/\epsilon)^{d\log m/m^3}\alpha_{B_1}(\beta_i)^{1/m^3} + 4\gamma^{-2m}(\epsilon^{1/4}/\gamma^2)^{1/m^3}$$

for any $i = 1, 2, \dots, m$.

Proof. For each $i = 1, 2, \dots, m$, let $M_i$ be the $(d \times d)$ matrix $(c_{iu}c_{iv})_{u,v}$. Square independence implies that the matrices $M_i$ are linearly independent. It follows that the rank of the $d^2 \times m$ matrix whose $((u,v),i)$ entry is $c_{iu}c_{iv}$ is $m$. The columns of this matrix are the $(d \times d)$ matrices $M_1, \dots, M_m$, regarded as vectors of length $d^2$. The rows are the vectors $C_{uv} = (c_{1u}c_{1v}, c_{2u}c_{2v}, \dots, c_{mu}c_{mv})$. Since row rank equals column rank, we can find $m$ linearly independent vectors $C_{uv}$. We have just shown that there is a collection of $m$ forms $\eta_j = \sum_{i=1}^m B_{ij}\beta_i$ for an invertible matrix $B$, so we can write $\beta_i = \sum_j (B^{-1})_{ji}\eta_j$. But in this situation Lemma 9.6 tells us that

$$\big(\min_j \alpha_{B_1}(\eta_j)\big)^m \le \prod_j \alpha_{B_1}(\eta_j) \le A\,\alpha_{B_1}(\beta_i)^{1/m^2} + a,$$

where we have written $A = \gamma^{-2m^2}(800d/\epsilon)^{d\log m/m^2}$ and $a = 8\gamma^{-2m^2}(\epsilon^{1/4}/\gamma^2)^{1/m^2}$. Therefore, there exists an index $j$ such that $\alpha_{B_1}(\eta_j) \le A^{1/m}\alpha_{B_1}(\beta_i)^{1/m^3} + a^{1/m}$. But $\eta_j$ equals $\beta_{uv}$ for some pair $(u,v) \in [d]^2$. □
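The linear-independence condition on the matrices $M_i = (c_{iu}c_{iv})_{u,v}$ is easy to test by machine. The sketch below is illustrative only: it works over the rationals as a stand-in for $\mathbb{Z}_N$ with $N$ a large prime, and the two example systems (a four-term progression and a six-form system in three variables) are hypothetical inputs chosen for the demonstration, not taken from this section.

```python
from fractions import Fraction


def flattened_square_matrix(coeffs):
    """Flatten the matrix M = (c_u * c_v)_{u,v} of one linear form into a row."""
    return [cu * cv for cu in coeffs for cv in coeffs]


def rank(rows):
    """Rank over the rationals, by Gaussian elimination with exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        # Look for a pivot in this column among the rows not yet used.
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def is_square_independent(forms):
    """True if the matrices M_i of the given forms are linearly independent."""
    return rank([flattened_square_matrix(f) for f in forms]) == len(forms)


# x, x+y, x+2y, x+3y: four squares in a 3-dimensional space of quadratic forms.
four_term_progression = [(1, 0), (1, 1), (1, 2), (1, 3)]
# x, y, z, x+y+z, x+2y-z, x-y+2z: an illustrative six-form system.
six_forms = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1), (1, 2, -1), (1, -1, 2)]

print(is_square_independent(four_term_progression))
print(is_square_independent(six_forms))
```

Working with the flattened matrices rather than with coefficient vectors of the quadratic forms $L_i(x)^2$ mirrors the proof above; the two give the same rank whenever 2 is invertible.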

We continue by proving a lemma that says that high-rank bilinear phase functions defined 
on Bohr sets are quasirandom in the following sense: they do not correlate well with 
products of functions of one variable. 

Lemma 9.8. Let $\epsilon > 0$ and let $B$ and $B'$ be part of a Bourgain system such that $B' \prec_\epsilon B$. Let $\beta$ be a bilinear form defined on $B^2$, and suppose that $P \subseteq B'$. Let $g$ and $h$ be two functions with $\|g\|_\infty$ and $\|h\|_\infty$ at most 1. Then

$$|\mathbb{E}_{x,y\in B}\,\omega^{\beta(x,y)}g(x)h(y)| \le (\alpha_P(\beta) + 6\epsilon)^{1/4}.$$

Proof. We have

$$|\mathbb{E}_{x,y\in B}\,\omega^{\beta(x,y)}g(x)h(y)|^4 \le \big(\mathbb{E}_{x\in B}|\mathbb{E}_{y\in B}\,\omega^{\beta(x,y)}h(y)|^2\big)^2,$$

which, by Lemma 2.3 (ii) and the difference-of-squares argument used in the proof of Lemma 9.3, is to within $4\epsilon$ equal to

$$\big(\mathbb{E}_{x\in B}|\mathbb{E}_{y\in B^-}\mathbb{E}_{z\in P}\,\omega^{\beta(x,y+z)}h(y+z)|^2\big)^2.$$

By the Cauchy-Schwarz inequality this is in turn bounded above by

$$\big(\mathbb{E}_{x\in B}\mathbb{E}_{y\in B^-}|\mathbb{E}_{z\in P}\,\omega^{\beta(x,y+z)}h(y+z)|^2\big)^2.$$

Expanding out the inner square and applying the triangle inequality, we can bound this above by

$$\big(\mathbb{E}_{y\in B^-}\mathbb{E}_{z,z'\in P}|\mathbb{E}_{x\in B}\,\omega^{\beta(x,z-z')}|\big)^2.$$

The inner sum is to within $\epsilon$ equal to $\mathbb{E}_{x\in B^-}\mathbb{E}_{w\in P}\,\omega^{\beta(x+w,z-z')}$, so our next upper bound is

$$\big(\mathbb{E}_{y\in B^-}\mathbb{E}_{z,z'\in P}|\mathbb{E}_{x\in B^-,w\in P}\,\omega^{\beta(x+w,z-z')}|\big)^2 + 2\epsilon.$$

Another application of Cauchy-Schwarz shows that this is at most

$$\mathbb{E}_{x,y\in B^-}\mathbb{E}_{z,z'\in P}|\mathbb{E}_{w\in P}\,\omega^{\beta(w,z-z')}|^2 + 2\epsilon = \mathbb{E}_{w,w',z,z'\in P}\,\omega^{\beta(w-w',z-z')} + 2\epsilon.$$

We recognize the first part of this expression as the definition of $\alpha_P(\beta)$. This proves the result. □
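To get a feel for the statement, one can compute both sides in the simplest possible setting. The sketch below is a toy instance under strong assumptions that are not those of the lemma: it takes $B = B' = P = \mathbb{Z}_N$ for a small prime $N$ (so no error term $\epsilon$ is needed), uses the hypothetical bilinear form $\beta(x,y) = xy$, and picks $g$, $h$ as random sign functions. The quantity `alpha()` is the average $\mathbb{E}_{w,w',z,z'\in P}\,\omega^{\beta(w-w',z-z')}$ that appears at the end of the proof.

```python
import cmath
import random

N = 17  # small prime modulus, chosen only to keep the computation tiny

# OMEGA[t] = exp(2*pi*i*t/N)
OMEGA = [cmath.exp(2j * cmath.pi * t / N) for t in range(N)]


def beta(x, y):
    """Hypothetical bilinear form beta(x, y) = xy on Z_N x Z_N."""
    return (x * y) % N


def alpha():
    """E_{w,w',z,z'} omega^{beta(w-w', z-z')} with all variables in Z_N."""
    total = 0
    for w in range(N):
        for w2 in range(N):
            for z in range(N):
                for z2 in range(N):
                    total += OMEGA[beta(w - w2, z - z2)]
    return (total / N**4).real


def correlation(g, h):
    """|E_{x,y} omega^{beta(x,y)} g(x) h(y)| with x, y ranging over Z_N."""
    total = sum(OMEGA[beta(x, y)] * g[x] * h[y]
                for x in range(N) for y in range(N))
    return abs(total / N**2)


rng = random.Random(0)
g = [rng.choice([-1, 1]) for _ in range(N)]
h = [rng.choice([-1, 1]) for _ in range(N)]

a = alpha()
print(a, correlation(g, h), a ** 0.25)
```

In this full-group setting the average equals $1/N$, so the form $xy$ has the highest possible rank and the correlation with any product $g(x)h(y)$ of bounded functions is correspondingly small.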

10. Computing with linear combinations of high-rank quadratic averages 

We are now in a position to perform the computation over the structured parts of our 
decompositions, which will be a key ingredient in the proof of the main result of this paper. 
The next lemma is very straightforward and will help us keep the proof of the subsequent 
computation as tidy as possible. 

Lemma 10.1. For each $j = 1, 2, \dots, r$, let $g_j$ and $g'_j$ be arbitrary functions on $\mathbb{Z}_N^s$. Let $G = \max_j \|g_j\|_\infty$, $G' = \max_j \|g'_j\|_\infty$ and $C = \max\{G, G'\}$. Then

$$\mathbb{E}_{x\in A}\prod_{j=1}^r g_j(x) - \mathbb{E}_{x\in A}\prod_{j=1}^r g'_j(x)$$

is bounded in absolute value by

(i) $rC^r\max_j \|g_j - g'_j\|_2$ if $A = \mathbb{Z}_N^s$ or

(ii) $rC^r\max_j \|g_j - g'_j\|_\infty$ if $A \subseteq \mathbb{Z}_N^s$.

Proof. In both cases the bound stated follows from the observation that

$$\prod_{j=1}^r g_j(x) - \prod_{j=1}^r g'_j(x) = \sum_{j=1}^r \prod_{i<j} g_i(x)\,\big(g_j(x) - g'_j(x)\big)\prod_{i>j} g'_i(x).$$

When $A = \mathbb{Z}_N^s$, this actually implies a stronger upper bound of $rC^r\max_j \|g_j - g'_j\|_1$, though we shall only need the $L_2$ bound. For general $A$ the above identity implies that $|\prod_j g_j(x) - \prod_j g'_j(x)| \le rC^r\max_j \|g_j - g'_j\|_\infty$ for every $x$, so it holds for the average over $x$ over any set $A$. □
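The telescoping identity used in this proof can be verified exactly at a single point $x$. The following sketch does so with rational numbers; the lists `g` and `gp` stand for the values $g_j(x)$ and $g'_j(x)$ at one fixed $x$ and are arbitrary illustrative data. Which of the two families carries the primes on each side of the $j$th factor is a choice made here for concreteness (either choice telescopes).

```python
from fractions import Fraction
from math import prod


def telescoping_terms(g, gp):
    """Terms prod_{i<j} g_i * (g_j - g'_j) * prod_{i>j} g'_i for j = 1..r."""
    r = len(g)
    return [prod(g[:j]) * (g[j] - gp[j]) * prod(gp[j + 1:]) for j in range(r)]


# Illustrative values of g_j(x) and g'_j(x) at one point x (r = 4).
g = [Fraction(3, 2), Fraction(-1), Fraction(1, 2), Fraction(2)]
gp = [Fraction(5, 4), Fraction(-1), Fraction(3, 4), Fraction(7, 4)]

r = len(g)
C = max(abs(v) for v in g + gp)              # plays the role of max{G, G'}
max_diff = max(abs(a - b) for a, b in zip(g, gp))
difference = prod(g) - prod(gp)

print(difference, sum(telescoping_terms(g, gp)))
```

Each term has $r-1$ factors of size at most $C$ and one factor of size at most $\max_j|g_j - g'_j|$, which gives a pointwise bound of $rC^{r-1}\max_j|g_j - g'_j|$; this is at most the $rC^r\max_j|g_j - g'_j|$ of the lemma whenever $C \ge 1$, as in this example.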

The next result has a long and complicated-looking proof. However, much of the complication is due to the need to keep track of ever more elaborate parameters as we apply the estimates of the preceding sections. So let us first give a qualitative discussion of the argument, to try to indicate what the underlying ideas are.

Recall that our ultimate aim is to obtain a small upper bound for the quantity

$$\mathbb{E}_{x\in(\mathbb{Z}_N)^s}\prod_{i=1}^r f(L_i(x))$$

when the linear forms $L_i$ are square independent and $\|f\|_{U^2}$ is sufficiently small. The basic idea behind the proof is to decompose $f$ as a sum of the form $\sum_i Q_iU_i + g + h$, where the $Q_i$ are generalized quadratic averages, the $U_i$ are functions with small $U^2$ dual norm, $g$ has small $L_1$ norm and $h$ has small $U^3$ norm, and then to substitute this expression in for $f$ and do the computations.

If that were all there was to it, then this paper would be much shorter than it is. However, replacing the $r$ occurrences of $f$ by quadratic averages in the above expression does not give a small result unless those averages have high rank. So a major task was to show, using the hypothesis that $\|f\|_{U^2}$ is small, that the decomposition could be made into high-rank averages. In the previous section, we proved that high-rank averages do indeed lead to small results.

There is one further difficulty, however. The most obvious thing to do at this stage would be to substitute $\sum_i Q_iU_i + g + h$ for each occurrence of $f$, with every $Q_i$ of high rank. This would give us a big collection of terms to deal with. But not all of them would be small. For example, if we take $g$ from every bracket, we obtain a term that has no reason to be small: the fact that $\|g\|_1$ is small is no guarantee that

$$\mathbb{E}_{x\in(\mathbb{Z}_N)^s}\prod_{i=1}^r g(L_i(x))$$

is small.

Instead, we do something slightly different. We first decompose just one copy of $f$, obtaining an expression of the form

$$\mathbb{E}_{x\in(\mathbb{Z}_N)^s}\prod_{i=1}^{r-1} f(L_i(x))\Big(\sum_i Q_i^{(r)}(L_r(x))U_i^{(r)}(L_r(x)) + g_r(L_r(x)) + h_r(L_r(x))\Big).$$
The effects of the $g_r$ and $h_r$ terms are now small: to deal with the $h_r$ term (which has small $U^3$ norm) we use a result of Green and Tao (Theorem 11.2 below), and to deal with the $g_r$ term we use the fact that it has small $L_1$ norm and the rest of the product is bounded. Thus, we can approximate the above expression by

$$\mathbb{E}_{x\in(\mathbb{Z}_N)^s}\prod_{i=1}^{r-1} f(L_i(x))\,f_r(L_r(x)),$$

where $f_r(x) = \sum_i Q_i^{(r)}(x)U_i^{(r)}(x)$. At this stage, we would like to repeat the process with the $(r-1)$st copy of $f$, but we have a much worse bound for $\|f_r\|_\infty$ than we had for $\|f\|_\infty$, so we have to choose a new decomposition $f = \sum_i Q_i^{(r-1)}U_i^{(r-1)} + g_{r-1} + h_{r-1}$ in such a way that $\|g_{r-1}\|_1\|f_r\|_\infty$ is small (and not just $\|g_{r-1}\|_1$). And then we continue the process.

This explains why Proposition 10.2 below concerns $r$ different functions and $r$ different decompositions. Once we have these decompositions, then the above argument is a sketch proof that we can ignore all the error terms and just concentrate on the terms involving high-rank quadratic averages, which is what we do in the proposition. So our problem is now reduced to obtaining an upper bound for the size of terms of the form

$$\mathbb{E}_{x\in(\mathbb{Z}_N)^s}\prod_{i=1}^r (Q_iU_i)(L_i(x)),$$

when all the $Q_i$ have high rank and the $U_i$ have not too large $U^2$ dual norm, since the expression we are left wishing to estimate is a sum of a bounded number of terms of this form.

The next complication (or rather, apparent complication, since we have the tools to deal 
with it) is that the Qi will have different bases and the high ranks will be with respect 
to different sets. All we really have to do in order to deal with that kind of problem is 
intersect everything. We know that sets from Bourgain systems have intersections that are 
not too small, and will use that fact repeatedly. 

The rough idea for dealing with a term of the above form is to find a set $D$ such that for every $i$ the functions $U_i(x)$ and $U_i(x+y)$ are close in $L_2$ for every $y \in D$. This we do by finding one such set for each $U_i$ and intersecting those sets. And for that we use the fact that $\|U_i\|_{U^2}^*$ is small for each $i$. Once we have done that, we use Lemma 8.13 to argue that our quadratic averages $Q_i$ still have high rank with respect to a generalized arithmetic progression sitting inside $D$. We then split the average we are trying to estimate into an average of averages taken over translates of $D$, which allows us to assume (after allowing for a small error) that the $U_i$ are constant. At this point we are doing a calculation that just involves high-rank quadratic functions on translates of $D$. The sort of expression we want to bound is

$$\mathbb{E}_{x_1\in z_1+D}\mathbb{E}_{x_2\in z_2+D}\cdots\mathbb{E}_{x_s\in z_s+D}\prod_{j=1}^r Q_j(L_j(x)).$$

If we expand out terms such as $Q_j(L_j(x))$, then we obtain sums that involve bilinear functions, at which point we use Lemmas 9.7 and 9.8 to show that there is always a high-rank bilinear function involved, and therefore that the corresponding terms are small. Now let us do the argument in detail.

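Proposition 10.2. Let $\epsilon, \theta > 0$. For each $j = 1, 2, \dots, r$, let $f_j = \sum_{i=1}^{k_j} Q_i^{(j)}U_i^{(j)}$ be a linear combination of $(\epsilon, m_j)$-special quadratic averages with bases $(B'^{(j)}, q_i^{(j)})$ on $\mathbb{Z}_N$, each of complexity at most $(d_j, \epsilon_j\rho_j/800d_j5^{d_j})$. Suppose further that each $Q_i^{(j)}$ is of rank $R_j$ with respect to some generalized arithmetic progression $P^{(j)} \subseteq B'^{(j)} = B_1'^{(j)}$ of dimension $d'_j \le k_jd_j$ and density $\gamma'_j$, and that $\sum_{i=1}^{k_j}\|U_i^{(j)}\|_\infty \le 2C_j$ and $\sum_{i=1}^{k_j}\|U_i^{(j)}\|_{U^2}^* \le T_j$.

Set $C = \max_j C_j$, $T = \max_j T_j$, $R = \min_j R_j$, $d = \max_j d_j$, $k = \max_j k_j$, $\gamma' = \min_j \gamma'_j$ and $\rho = \min_j \rho_j$. Finally, suppose that $r(2kC)^r\epsilon \le \theta$.

Let $L_1, \dots, L_r$ be a square independent system of $r$ forms in $s$ variables, and set $M = \max_i \sum_{u=1}^s |c_{iu}|$. Then

$$\Big|\mathbb{E}_{x\in(\mathbb{Z}_N)^s}\prod_{j=1}^r f_j(L_j(x))\Big| \le 5\theta + \chi e^{-R/4r^3},$$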



where 

X = xM)=( -^-^ j 

Proof. We can split the expectation into individual terms of the form

$$\mathbb{E}_{x\in(\mathbb{Z}_N)^s}\prod_{j=1}^r \big(Q_{i_j}^{(j)}U_{i_j}^{(j)}\big)(L_j(x)) \tag{2}$$

where each sequence $(i_1, \dots, i_r)$ belongs to $[k_1] \times \dots \times [k_r]$. Let us fix such a sequence, and for ease of notation let us write $Q'_jU_j$ instead of $Q_{i_j}^{(j)}U_{i_j}^{(j)}$. We shall obtain a bound for (2) and then multiply it by $\prod_{j=1}^r k_j \le k^r$ to obtain a bound for $\mathbb{E}_{x\in(\mathbb{Z}_N)^s}\prod_{j=1}^r f_j(L_j(x))$.

Since $\|U_j\|_{U^2}^* \le \sum_{i=1}^{k_j}\|U_i^{(j)}\|_{U^2}^* \le T_j \le T$ and $\|U_j\|_\infty \le \sum_{i=1}^{k_j}\|U_i^{(j)}\|_\infty \le 2C_j \le 2C$, Lemma IH751 gives us, for each $j = 1, 2, \dots, r$ and any $\xi > 0$, a Bohr set $E_j$ of complexity at most $((2C/\xi)^2, \xi)$ such that

$$\mathbb{E}_x|U_j(x+y) - U_j(x)|^2 \le 4\xi^2C^2 + 4T^{4/3}\xi^{2/3}$$

for each $y \in E_j$. Therefore, for each subset $E \subseteq E_j$,

$$\|U_j - U_j * \mu_E\|_2^2 \le \mathbb{E}_{y\in E}\mathbb{E}_x|U_j(x+y) - U_j(x)|^2 \le 4\xi^2C^2 + 4T^{4/3}\xi^{2/3},$$

where $\mu_E$ is the characteristic measure of $E$. In particular, if we set $\xi = (\theta/r(2kC)^r)^3/2^6T^2$ (assuming, as usual, that $T$ is much larger than $C$) and $E = E_1 \cap \dots \cap E_r$, then it is readily checked that

$$\|U_j - U_j * \mu_E\|_2 \le \theta/r(2kC)^r$$

for all $j = 1, 2, \dots, r$. Using Lemma 10.1 (i), we can therefore replace the average (2) by the expression

$$\mathbb{E}_{x\in(\mathbb{Z}_N)^s}\prod_{j=1}^r \big(Q'_j(U_j * \mu_E)\big)(L_j(x)) \tag{3}$$

at the cost of an error of at most $\theta/k^r$.
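The "readily checked" bound on $\|U_j - U_j * \mu_E\|_2$ is a short computation. In the sketch below, $\delta = \theta/r(2kC)^r$ is a shorthand introduced only for this check, and the assumptions $T \ge \max\{C, 1\}$ and $\delta \le 1$ are made explicit (the text only says that $T$ is much larger than $C$).

```latex
\begin{align*}
\xi = \frac{\delta^3}{2^6T^2}
  \;&\Longrightarrow\; 4T^{4/3}\xi^{2/3} = 4T^{4/3}\cdot\frac{\delta^2}{2^4T^{4/3}} = \frac{\delta^2}{4}, \\
4\xi^2C^2 &= \frac{C^2\delta^6}{2^{10}T^4} \le \frac{\delta^2}{4}
  \qquad\text{(using } C^2 \le T^4 \text{ and } \delta \le 1\text{)}, \\
\|U_j - U_j*\mu_E\|_2^2 &\le 4\xi^2C^2 + 4T^{4/3}\xi^{2/3} \le \frac{\delta^2}{2} \le \delta^2 .
\end{align*}
```

With $\|Q'_jU_j\|_\infty \le 2C$, Lemma 10.1 (i) then gives an error of at most $r(2C)^r\delta = \theta/k^r$, which is the cost quoted above.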

Now $E$ is a Bohr set $B(K, \xi)$ of dimension $d_E \le r(2C/\xi)^2 \le 2^{14}r^7(2kC)^{6r}T^4/\theta^6$ and density $\gamma_E \ge \xi^{d_E} \ge (\theta^3/2^6r^3(2kC)^{3r}T^2)^{2^{14}r^7(2kC)^{6r}T^4/\theta^6}$. Moreover, $U_j * \mu_E$ is roughly constant on translates of central subsets $E'$. More precisely, in order for $U_j * \mu_E$ to be constant to within $\theta/(r(3kC)^r)$ on translates of $E' = B(K, \xi')$, it is enough if $E' \prec_{\theta/(r(3kC)^r)} E$, by Lemma 2.3. Let us note for the record that in this case the dimension of $E'$ is $d_{E'} = d_E$ and the density is $\gamma_{E'} \ge (\theta/r(3kC)^r800d_E)^{d_E}\gamma_E$, which is bounded below by $(\theta^{10}/2^{30}r^{11}(3kC)^{10r}T^6)^{2^{14}r^7(2kC)^{6r}T^4/\theta^6}$.

Suppose that the linear form $L_i(x)$ is given by the formula $\sum_{u=1}^s c_{iu}x_u$. Let $E'' = B(K, \xi'/M)$, where $M = \max_j \sum_{u=1}^s |c_{ju}|$. As a result, $E''$ has dimension $d_{E''} = d_E$ and density $\gamma_{E''} \ge M^{-d_E}\gamma_{E'}$, which is at least $(\theta^{10}/2^{30}r^{11}(3kC)^{10r}T^6M)^{2^{14}r^7(2kC)^{6r}T^4/\theta^6}$. The reason for passing to this smaller Bohr set $E''$ is so that it will have the following property: if $x_u \in E''$ for every $u$, then $\sum_{u=1}^s c_{iu}x_u \in E'$.
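The closure property just stated is a direct consequence of the triangle inequality, and it can be confirmed by brute force in a small example. The sketch below assumes the usual definition $B(K,\rho) = \{x \in \mathbb{Z}_N : \|kx/N\| \le \rho \text{ for all } k \in K\}$, where $\|\cdot\|$ is the distance to the nearest integer (the definition itself is not restated in this section). The modulus, frequency set, width and linear forms are all hypothetical choices made for the demonstration.

```python
from itertools import product


def dist_to_multiple(t, N):
    """Distance from the integer t to the nearest multiple of N, i.e. N*||t/N||."""
    t %= N
    return min(t, N - t)


def bohr_set(K, rho, N):
    """B(K, rho): all x in Z_N with ||k*x/N|| <= rho for every frequency k in K."""
    return {x for x in range(N)
            if all(dist_to_multiple(k * x, N) <= rho * N for k in K)}


def image_stays_inside(coeffs, small, big, N):
    """Check that sum_u c_u * x_u lies in `big` whenever every x_u lies in `small`."""
    return all(sum(c * x for c, x in zip(coeffs, xs)) % N in big
               for xs in product(small, repeat=len(coeffs)))


N = 1009                      # illustrative prime modulus
K = [1, 37]                   # illustrative frequency set
rho = 0.1                     # width of the larger Bohr set E'
forms = [(1, 2), (1, -3)]     # coefficient vectors (c_{i1}, c_{i2}) of two forms
M = max(sum(abs(c) for c in f) for f in forms)

E_prime = bohr_set(K, rho, N)             # plays the role of E' = B(K, xi')
E_double_prime = bohr_set(K, rho / M, N)  # plays the role of E'' = B(K, xi'/M)

print(len(E_prime), len(E_double_prime),
      all(image_stays_inside(f, E_double_prime, E_prime, N) for f in forms))
```

The narrowing factor $1/M$ is exactly what the triangle inequality requires: $\|k\sum_u c_{iu}x_u/N\| \le \sum_u |c_{iu}|\,\|kx_u/N\| \le M \cdot \xi'/M$.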

Let $B' = B'_1 \cap \dots \cap B'_r$. Then $B'$ is a Bohr set of dimension $d_{B'} \le rd$ and density $\gamma_{B'} \ge (\epsilon\rho/800d5^d)^{rd}$. Let $B''$ be a narrowing of $B'$ by the same factor $1/M$, so $B''$ is a Bohr set of dimension $d_{B''} = d_{B'}$ and density $\gamma_{B''} \ge (\epsilon\rho/800d5^dM)^{rd}$. Finally, set $D = E'' \cap B''$, which is like a Bohr set but with "different widths in different directions". Rather than go into the details of this, we merely observe that if $E'' = B(K, \xi'')$ and $B'' = B(L, \tau'')$, then we can define a Bourgain system $(D_\mu)$ by setting $D_\mu$ to be $B(K, \mu\xi'') \cap B(L, \mu\tau'')$. By Lemma 8.10 and the remark following Lemma 8.3 (which says that the dimension of a Bohr set $B(K, \rho)$ considered as part of a Bourgain system is at most $3|K|$), this is a Bourgain system of dimension $d_D \le 12(d_{E''} + d_{B''}) \le 2^{18}r^7(2kC)^{6r}T^4/\theta^6 + 2^4rd$ such that $D = D_1$ has density $\gamma_D \ge 2^{-9(d_{E''}+d_{B''})}\gamma_{E''}\gamma_{B''} \ge (\epsilon\rho/2^{19}d5^dM)^{rd}(\theta^{10}/2^{39}r^{11}(3kC)^{10r}T^6M)^{2^{14}r^7(2kC)^{6r}T^4/\theta^6}$ by Lemma 8.10.

We shall cover $\mathbb{Z}_N^s$ with translates of $D$, and compute the expectation

$$\mathbb{E}_{x_1\in z_1+D}\mathbb{E}_{x_2\in z_2+D}\cdots\mathbb{E}_{x_s\in z_s+D}\prod_{j=1}^r \big(Q'_j(U_j * \mu_E)\big)(L_j(x)) \tag{4}$$

for some fixed choice of $z_1, \dots, z_s \in \mathbb{Z}_N$. Now if each $x_i$ is confined to a translate $z_i + D$, then $L_j(x)$ is contained in some particular translate $y_j + E' \cap B'$, by our choice of $E''$ and $B''$. On this translate, $U_j * \mu_E$ is constant to within $\theta/(r(3kC)^r)$. More precisely, we can write $U_j * \mu_E(x) = \lambda_{y_j} + \epsilon_j(x)$, where $\|\epsilon_j\|_\infty \le \theta/(r(3kC)^r)$ for all $j = 1, 2, \dots, r$. Taking into account the fact that $\sum_{i=1}^{k_j}\|U_i^{(j)}\|_\infty \le 2C_j$, we immediately note that $|\lambda_{y_j}| \le 3C_j$ for any $j = 1, 2, \dots, r$. It follows from Lemma 10.1 (ii) that at the cost of an error of at most $\theta/k^r$, we can focus on evaluating

$$\Big(\prod_{j=1}^r \lambda_{y_j}\Big)\mathbb{E}_{x_1\in z_1+D}\mathbb{E}_{x_2\in z_2+D}\cdots\mathbb{E}_{x_s\in z_s+D}\prod_{j=1}^r Q'_j(L_j(x)) \tag{5}$$

instead of the earlier average (4). We recall that each $Q'_j$ was an $(\epsilon, m_j)$-special average with base $B'_j$ and rank at least $R_j$ with respect to $P^{(j)} \subseteq B'_j$. In particular, for each $j = 1, 2, \dots, r$, since $D \subseteq E' \cap B' \subseteq B'_j$, we find that for all but $\epsilon N$ choices of $y_j \in \mathbb{Z}_N$, the restriction of $Q'_j$ to $y_j + D$ is equal to the restriction of $\omega^{q'_j}$ to $y_j + D$, where $q'_j(v) = q_j(v - v_j)$ for one of at most $m_j$ fixed values $v_j \in \mathbb{Z}_N$. Let us say that $(y_1, \dots, y_r)$ is good if this is true for every $j \le r$.

Observe that as each $z_1, \dots, z_s$ runs over $\mathbb{Z}_N$, so does $L_j(z_1, \dots, z_s)$ for each $j = 1, 2, \dots, r$. Therefore a proportion of at least $(1 - \sum_j \epsilon_j) \ge (1 - \epsilon r)$ of all choices of $(z_1, \dots, z_s) \in (\mathbb{Z}_N)^s$ gives rise to a good sequence $(y_1, \dots, y_r)$. If $(y_1, \dots, y_r)$ is good, then fix a value $v_j$ for each $j = 1, 2, \dots, r$. Now since the $\epsilon_j$ were required to satisfy $r(2kC)^r\epsilon \le \theta$, then incurring an error of at most $\theta/k^r$, we can restrict our attention to

$$\Big(\prod_{j=1}^r \lambda_{y_j}\Big)\mathbb{E}_{x_1\in z_1+D}\mathbb{E}_{x_2\in z_2+D}\cdots\mathbb{E}_{x_s\in z_s+D}\prod_{j=1}^r \omega^{q_j(L_j(x) - v_j)} \tag{6}$$

for some fixed choice of $v_1, \dots, v_r$. Recall that for each $j = 1, 2, \dots, r$, the linear form $L_j(x)$ was given by the formula $\sum_{u=1}^s c_{ju}x_u$. Writing $\beta_j$ for the bilinear form associated with $q_j$,






we have

$$\beta_j(L_j(x), L_j(x)) = \sum_{u=1}^s\sum_{v=1}^s c_{ju}c_{jv}\beta_j(x_u, x_v).$$

For each $u$ and $v$ let us write $\beta_{uv}$ for the bilinear form $\sum_{j=1}^r c_{ju}c_{jv}\beta_j$ as before.

Set P = fl ••• D P( r \ which is part of a Bourgain system of dimension dp < 
±r 2 J2j d j < ^ kd and has density 7p > 2-^^75 ^ 2' 4r kd j' r . We shall now consider 
the rank of each $q_j$ with respect to $P' = P \cap D'$, where $D' \prec_{\theta^4/6(3kC)^{4r}} D$. In order to do so, we need to determine the dimension and density of $P'$, which is the main reason we have been carefully keeping track of our parameters since the start of the proof.

First note that $D'$ is part of a Bourgain system of dimension $d_{D'} = d_D \le 2^{22}r^8d(2kC)^{6r}T^4/\theta^6$ as determined earlier and has density $\gamma_{D'} \ge (\theta^4/2^{15}(3kC)^{4r}d_D)^{d_D}\gamma_D$, which is bounded below by $(\epsilon\rho\theta^{20}/2^{95}r^{19}d^25^d(3kC)^{20r}T^{10}M^2)^{2^{22}r^8d(2kC)^{6r}T^4/\theta^6}$. Therefore $P'$ is part of a Bourgain system of dimension

$$d_{P'} \le 4(d_P + d_{D'}) \le 2^{26}r^{11}kd^2(2kC)^{6r}T^4/\theta^6$$

and has density



, fl 90 \ 2 26 r 11 kd 2 (2kC) 6r T 4 /e 6 

7P , > 2~^ dp+d ^ lP lD> > l' r I — I 

IP- lPW-1 \2"r 19 d 2 5 d (3kC) 20r T w M 2 J 



by Lemma 8.10.

Finally, we use Lemma 8.13 to make the connection between the rank of our quadratic phases with respect to $P$ and $P'$. The lemma tells us that $\alpha_{P'}(\beta_i) \le (\gamma_P/\gamma_{P'})\alpha_P(\beta_i)$ for each $i = 1, 2, \dots, r$.

Let $\eta = \gamma_{P'}^{8(1+r^4)}(\theta/4(3kC)^r)^{16r^3}$. Lemma 9.7 with $\epsilon = \eta$, $B_1 = P'$ and $m = r$ tells us that there exists a pair $(u,v) \in [s]^2$ such that the bilinear form $\beta_{uv}$ defined above satisfies

$$\alpha_{P'}(\beta_{uv}) \le \gamma_{P'}^{-2r}(800d_{P'}/\eta)^{d_{P'}\log r/r^3}\alpha_{P'}(\beta_i)^{1/r^3} + 4\gamma_{P'}^{-2r}(\eta^{1/4}/\gamma_{P'}^2)^{1/r^3}$$

for any $i = 1, 2, \dots, r$. To conclude the proof, note that Lemma 9.8 implies that

r 

\E Xiezi+D E X2ez2+D . . . E Xs£Zs+D H w^Ui A»(*«.*.)+EZ =1 < 0/( 3A; cy + a P , (/3™) 1 / 4 

for any fixed linear forms <p u and any constant 0, which is at most 

6/(3kC) r + 7p r /2 (800d P Vr / )^' ^ r / 4r V(A) 1/4r3 + 4 7 - r/2 (r / 1 / 4 /7 2 ,) 1/4r3 
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and therefore bounded above by 



t \ l/4r 3 

9/(3kC) r + ^(SOOd^/rj)^' 10 ^ 3 I ^ ap(A) 1/4r3 + 2 7 p, r/2 (r/ 1/4 hl'Y 1 ^ ■ 

\1p' J 

Our choice of $\eta$ implies that the third term is no larger than the first, and that the second term is at most

2 239 r 59 rf 6 5 2 d(3A;C , r T 24 M 4X 

) up{Pi) ■ 

Recalling that in (6) we had a pre-factor of $\prod_{j=1}^r \lambda_{y_j}$ with each $|\lambda_{y_j}| \le 3C_j$ and in (3) a factor of $k^r$, and that $\alpha_P(\beta_i) \le e^{-R}$ for every $i = 1, 2, \dots, r$, we obtain the final bound as stated. □



11. Proof of the main result 



Most of the work towards proving the main result was accomplished in the preceding 
section. Here we shall formally complete the proof of the following theorem. 

Theorem 11.1. Let $L_1, \dots, L_r$ be a square independent system of linear forms in $s$ variables of Cauchy-Schwarz complexity at most 2. For every $\eta > 0$, there exists $c > 0$ with the following property. Let $f : \mathbb{Z}_N \to [-1, 1]$ be such that $\|f\|_{U^2} \le c$. Then

$$\Big|\mathbb{E}_{x\in(\mathbb{Z}_N)^s}\prod_{i=1}^r f(L_i(x))\Big| \le \eta.$$

Moreover, $c$ can be taken to depend on $\eta$ in a doubly exponential fashion.

As in [GW09a, GW09b], we need to recall a well-established result that will allow us to neglect the quadratically uniform part of the decomposition.

Theorem 11.2. Let $f_1, \dots, f_r$ be functions on $\mathbb{Z}_N$, and let $L_1, \dots, L_r$ be a linear system of Cauchy-Schwarz complexity at most 2 consisting of $r$ forms in $s$ variables. Then

$$\Big|\mathbb{E}_{x\in(\mathbb{Z}_N)^s}\prod_{i=1}^r f_i(L_i(x))\Big| \le \min_j \|f_j\|_{U^3}\prod_{i\ne j}\|f_i\|_\infty.$$



Proof of Theorem 11.1. Let $\eta > 0$, and let $c > 0$ be chosen in terms of $\eta$ later. Given $f : \mathbb{Z}_N \to [-1, 1]$ with $\|f\|_{U^2} \le c$ we first apply Theorem 8.16 with $\delta_1 = \eta/(24r)$ to obtain a decomposition

$$f = f_1 + g_1 + h_1,$$

where $f_1 = \sum_j Q_j^{(1)}U_j^{(1)}$ with $\sum_j \|U_j^{(1)}\|_\infty \le 2C_1$, $\sum_j \|U_j^{(1)}\|_{U^2}^* \le T_1$, $\|g_1\|_1 \le 10\delta_1$ and $\|h_1\|_{U^3} \le 2\delta_1$. We have carefully ensured that each quadratic average $Q_j^{(1)}$ has rank at least $R_1$ for some $R_1$ to be chosen later. Aiming to bound

$$\mathbb{E}_{x\in(\mathbb{Z}_N)^s}\prod_{i=1}^r f(L_i(x))$$

above in absolute value by $\eta$ for sufficiently uniform $f$, we first replace the first instance of $f$ in the product by $g_1 + h_1$. The product involving $g_1$ yields an error term of $10\delta_1$ since all the remaining factors have $L_\infty$ norm bounded by 1, while the product involving $h_1$ yields an error of $2\delta_1$ by Theorem 11.2 above. Our choice of $\delta_1$ implies that the sum of these two errors is at most $\eta/(2r)$.

Now we apply Theorem 8.16 again, this time with $\delta_2 = \eta/(48rC_1)$, to obtain a decomposition

$$f = f_2 + g_2 + h_2,$$

where $f_2 = \sum_j Q_j^{(2)}U_j^{(2)}$ with $\sum_j \|U_j^{(2)}\|_\infty \le 2C_2$, $\sum_j \|U_j^{(2)}\|_{U^2}^* \le T_2$, $\|g_2\|_1 \le 10\delta_2$ and $\|h_2\|_{U^3} \le 2\delta_2$. When replacing the first instance of $f$ in the new product

$$\mathbb{E}_{x\in(\mathbb{Z}_N)^s}f_1(L_1(x))\prod_{i=2}^r f(L_i(x))$$

with $g_2 + h_2$, the product involving $g_2$ now contributes an error term of at most $20\delta_2C_1$ (since $\|f_1\|_\infty \le 2C_1$). By Theorem 11.2 it follows that the contribution from the product involving $h_2$ is bounded above by $4\delta_2C_1$. Therefore the total error incurred is at most $24\delta_2C_1$, which is at most $\eta/(2r)$ by our choice of $\delta_2$.
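The bookkeeping for these first two steps can be confirmed with exact arithmetic. The sketch below simply evaluates the two error budgets for the stated choices of $\delta_1$ and $\delta_2$; the numerical values of $\eta$, $r$ and $C_1$ are placeholders chosen for illustration, and the function name is hypothetical.

```python
from fractions import Fraction


def first_two_errors(eta, r, C1):
    """Total error incurred in steps 1 and 2 for the stated choices of delta."""
    delta1 = eta / (24 * r)
    delta2 = eta / (48 * r * C1)
    # Step 1: the g_1 term costs 10*delta1 and the h_1 term costs 2*delta1.
    err1 = 10 * delta1 + 2 * delta1
    # Step 2: the g_2 term costs 20*delta2*C1 and the h_2 term costs 4*delta2*C1.
    err2 = 20 * delta2 * C1 + 4 * delta2 * C1
    return err1, err2


eta, r, C1 = Fraction(1, 10), 4, Fraction(7)   # placeholder values
err1, err2 = first_two_errors(eta, r, C1)
print(err1, err2, eta / (2 * r))
```

Both budgets come out to exactly $\eta/(2r)$, so $r$ steps of this kind cost at most $\eta/2$ in total.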

When we come to apply Theorem 8.16 to the $k$th instance of $f$ in the original product, we need to do so with $\delta_k$ satisfying $12 \cdot 2^{k-1}\delta_kC_1 \cdots C_{k-1} \le \eta/(2r)$ for $k = 2, \dots, r$. This ensures that up to an error of $\eta/2$, it suffices to consider the product

$$\mathbb{E}_{x\in(\mathbb{Z}_N)^s}\prod_{j=1}^r f_j(L_j(x)),$$

where each function $f_j$ is quadratically structured. The key estimate, Proposition 10.2 with $\theta = \eta/20$, now implies that

$$\Big|\mathbb{E}_{x\in(\mathbb{Z}_N)^s}\prod_{j=1}^r f_j(L_j(x))\Big| \le \eta/4 + \chi e^{-R/4r^3},$$
where



i r 22 k 2 d' l (2kC) L2r T >i /r) L 

■ 

x(v) 



2^r 59 d 6 5 2d (3kC) 50r T 24 M £ 



with $C = \max_j C_j$, $T = \max_j T_j$, $R = \min_j R_j$, $d = \max_j d_j$, $k = \max_j k_j$, $\gamma' = \min_j \gamma'_j$, $\rho = \min_j \rho_j$ and $\epsilon = \max_j \epsilon_j$. Choosing $R_j$ large enough at each stage, we will be able to force $\chi e^{-R/4r^3} \le \eta/4$.

The argument is essentially complete; it remains to check that the dependence we obtain is doubly exponential. First, note that every application of Theorem 8.16 returns $C_j$, $d_j$, $k_j$ as well as $\rho_j$ and (the upper bound on) $\epsilon_j$ as parameters that are polynomial in $\delta_j$, and hence polynomial in $\eta$. Only $T_j$ is exponential in $\eta$.

Also, in order to apply Proposition 10.2 we needed to assume that the parameters $\epsilon_j$ satisfy $r(2kC)^r\epsilon \le \eta/20$. This means that the density of the progression $P_j$ used in the $j$th decomposition is at least $(\eta\rho/2^{17}r(2kC)^rd^2k5^d)^{dk}$, which does not affect the doubly exponential nature of $\chi(\eta)$. Hence it is possible to choose $R_j$ to be an exponential function of $\eta$ at each stage. By Theorem 8.16 this is possible provided that $\|f\|_{U^2} \le c$, where $c$ is bounded above by

/ X7 3 3 \ 2 65k md 12k k 15k T 4 C 2 /S (i 

e -2 15k d 7k k Gk R ( 6 P € 



2 102k k 8 d 8 5 2d C 3 T\ 

where $\delta = \min_j \delta_j$ and $R$ was chosen to satisfy $\chi e^{-R/4r^3} \le \eta/4$. More precisely, the average $\mathbb{E}_x\prod_{j=1}^r f(L_j(x))$ is less than $\eta$ provided that $c$ is at most

^54^3 x 2 llsk r 25 k 8k d llk {2kC) 12r T s / v 12 , p?> x 2 65k md 12k k 15k T 4 C 2 /8 r > 



2 466 Ar 62 d 8 5 3<2 (3fcC) 52r T 24 M 4 ) \2 102k k 8 d 8 5 2d C 3 T\ 

With $m$ and $r$ being fixed constants, $M$ being a constant depending on the coefficients of the linear forms, $C$, $\delta$, $d$, $k$ and $\rho$ depending polynomially on $\eta$ and $T$ depending exponentially on $\eta$, this bound on $c$ is indeed doubly exponential in $\eta$ as claimed.

□ 